Prediction: Change at HBO

Prediction: Eric Kessler will change his views on cable cutters, or he will no longer work at HBO. Only a matter of time.

He epitomizes how out of touch media companies are with technology. Consider this quote:
"At that time, Kessler also said his company sees cable-cutting as no more than a temporary austerity measure that will cease as soon as the economy takes a turn for the better."

Anyone have a quote from telecoms a decade ago talking about cell phones not being a threat to landlines?

Posted in Uncategorized | Leave a comment

Applicable University

CS 193H High Performance Websites

Good work Stanford! Producing graduates with applicable skills is so valuable.

I’ve thought Universities needed more applicable tracks for a long time. I’ve heard people argue against me, too, and I’ve never felt their arguments held much water. One that’s been brought up a few times is that “they teach concepts applicable to lots of scenarios, not specific implementations”. I think you can learn how real-world people have applied the concepts to real tools, debate their decisions, and come away with a well-rounded understanding of the theory plus an understanding of a specific example. As an employer, I’d prefer you to know how to tune a garbage collector (any garbage collector) than not. The concepts port well, but if you’ve never done it before, it takes a long time.

I think the real reason there aren’t more applicable classes at a lot of Universities is that the profs can’t teach them. A lot of profs have never worked. They got a Bachelors, a Masters, a PhD, and now they teach. Or else when they did work it was decades ago, or maybe in a research center, or wherever. Regardless, I don’t believe there are many professors around with Twitter, Facebook, Amazon, or Google experience. By the end of their PhD program, they’re under-qualified to teach anything practical, because they may not know what is practical and what isn’t. (To be sure, this does not apply to all profs.)

Don’t get me wrong, PhD work is incredibly valuable, and leads to amazing real-world implementations of breakthrough theories, but the professional implementors are generally underrepresented at Universities. Students graduating with a Masters or a PhD are often less qualified for practical software engineering jobs. Stanford seems to be addressing this, and I hope other Universities follow.

Other classes I’d love to see:

CS 19X Latency-Constrained, High-Volume Services

  • load testing
  • tuning the JVM
  • detecting and determining bottlenecks
  • scale out, not up
  • estimating hardware needs
  • service redundancy
  • DB sharding
  • DB failover strategies and implications
  • DB growth analysis and strategies
  • seamless deployments and rolling back
  • threading horror stories
  • server security and authentication

CS 19Y Unix Administration for Developers

  • shell programming and the profile/zshrc/etc
  • applications run as users
  • what’s listening on that port? (and network strategies)
  • server tuning and hardware selection
  • logging
  • AWS-applied (from zero to service stack)
  • backups, archives, storage, and recovery
  • DNS
  • connectivity
  • keys, passwords, tunneling and other security concerns

CS 19Z Practical Development Practices and Tools

  • source control
  • unit testing
  • integration testing
  • continuous build & test, code coverage
  • dependency management 1, POM Hell, an introduction to Maven
  • dependency management 2, bundler internals
  • deployments
  • REST vs SOAP
  • real-world commenting, READMEs, and technical blogging
  • bug tracking & sprint planning

What’s going on in your app?

I just read a blog post off Hacker News: Why loading third party scripts async is not good enough. It reminded me of someone I used to work with at Amazon who would regularly find errors in our applications. This was quite a feat at Amazon because we instrument everything. We have regexes constantly parsing logs looking for errors, and we have a dozen kinds of monitors collecting host metrics, server metrics, client metrics, business metrics, coffee temperature metrics, etc., all constantly checking “is your cpu load high?”, “do you have enough free memory?”, “how many times did you show pictures of the Twilight case?”, etc.

This one engineer (on a team of exceptional engineers) was consistently the only one to find errors. It was definitely very healthy for the team, but engineers secretly hate this because, by definition, finding errors means he’s pointing out faults in your work. Managers less secretly hate this because it means he’s ‘creating’ high priority work that gets addressed ahead of their projects.

So with all these metrics and monitors on a team of high achievers, how did this one person on our team keep finding errors? He looked at the logs.

That was his secret weapon: reading logs! It’s like grade 1 of service maintenance. With all our monitoring, regexes, and features, we thought we were too good to ‘just’ read logs. The rest of the team would release features, put regexes in place to detect our errors, trace a few requests after launching, and then move on to the next project. I honestly don’t know how much time he spent on it, but every week or two, he’d come in and explain how our programs were messing something up.

Things like:

  • requests to a dependency fail. We monitor overall failures, and accept failures of less than 0.1% (just hiccups and connection problems, right?). Turns out our dependency never worked for 0.1% of our customers.
  • we have a dependency known to have errors, but retries often succeed. We will retry every request once before raising an error. Our dependency makes a change which we don’t notice, but our retry rate goes from 2% to 50%.
  • you have ‘targeting’ params which you consume if available (e.g. the HTTP referrer header). You make a change which loses this data in the course of a request and now you’re never using it to target.

There were three morals of this story:

  1. Drill down into your metrics and understand where they are coming from (and their deficiencies)
  2. Your monitoring will never be perfectly reliable, so you regularly need to just randomly re-verify things are working
  3. Every time you catch a problem, install the proper monitoring to make sure it never happens again
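
As a rough illustration of the first moral, here is a minimal Python sketch of drilling down from an aggregate failure rate to a per-customer one, plus a simple retry-rate check. The record shape (`customer_id`, `ok`, `retried`), the function names, and the sample data are all assumptions made up for this example; they are not taken from any real monitoring system.

```python
from collections import defaultdict


def overall_failure_rate(requests):
    """Fraction of all requests that failed (the 'aggregate' metric)."""
    if not requests:
        return 0.0
    return sum(1 for r in requests if not r["ok"]) / len(requests)


def always_failing_customers(requests):
    """Customers for whom every single request failed."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for r in requests:
        totals[r["customer_id"]] += 1
        if not r["ok"]:
            failures[r["customer_id"]] += 1
    return sorted(c for c in totals if failures[c] == totals[c])


def retry_rate(requests):
    """Fraction of requests that needed a retry."""
    if not requests:
        return 0.0
    return sum(1 for r in requests if r["retried"]) / len(requests)


# Hypothetical data: 999 healthy customers with two requests each,
# and one customer (c999) whose requests never succeed.
requests = [
    {"customer_id": f"c{i}", "ok": True, "retried": False}
    for i in range(999)
    for _ in range(2)
]
requests += [
    {"customer_id": "c999", "ok": False, "retried": True} for _ in range(2)
]

print(f"overall failure rate: {overall_failure_rate(requests):.2%}")
print(f"always failing: {always_failing_customers(requests)}")
print(f"retry rate: {retry_rate(requests):.2%}")
```

The aggregate rate here sits at the 0.1% level that looks like harmless noise, while the per-customer breakdown shows one customer for whom the service has never worked.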

In my experience, the most likely error is one you’ve seen before.


Applying for a job?

Applying for a job as a software engineer?

The odds are you have a bad resume (since >50% of the resumes I’ve seen are bad).

The Resume

Objective


No! Waste of space. It can only hurt. Your objective is to get a phone call or email. Don’t apply for jobs you don’t want. If you’re unsure, just apply like you want it and ask questions in the first conversation.


2 pages max, 1 page if possible


Prefer PDF — it’s more universal. Windows is losing market-share in development circles. Word is not universal.

Check the job description — if they indicate a format, follow it.

If you do use MS Word:

  • email it to yourself and verify it looks good in Google Docs (I’m not going to download it)
  • email it to a friend with a Mac and verify it looks good (some people do download it)


  • CS Degree at the top
  • Certificates at the bottom or omitted
  • Github account at the top (as a link)

Experience


  • One entry per company
  • Include years worked there
  • Multiple roles => more bullet points
  • What was your role? What did you do?

Skills


  • SMALL LIST (wow, you know XML? Really? Cause that’s hard to find and hard to teach … not)
  • technology stacks and platforms are enough
  • don’t list things you would be uncomfortable interviewing in (with 1 day’s notice)
  • Seriously! You list it, the interviewer can ask about it, and expect you to code in it

Good Examples

  • Java: Tomcat on AWS with MySQL
  • Ruby: Rails on Heroku with Postgres
  • Web UI: HTML/CSS/JS on PHP

Bad Examples

  • Java: Tomcat, Jetty, Maven, Ant, JUnit, XML, SOAP, Hibernate, JSP, blah blah blah
  • Ruby: Rails, Rake, JSON, Devise, Cucumber, Webrat, RSpec, VCR, FactoryGirl, …

Small caveat

If you’re applying to big companies that are tech ignorant, you might need a laundry list to get past HR.

Extra Curriculars

  • yes if you’re just graduating from college
  • yes if they’re significant
  • for non-new grads, at most one line (unless you’re putting in 5+ hours per week)
  • don’t list every club you’ve ever been a part of
  • don’t list activities that include demographic information (Gay and Lesbian orgs, Christian Missionaries, etc)

Don’t list things that some people “don’t get”. No need to mention your involvement with D&D, Video Game communities, or Twilight fan clubs.

Cover Letter

Generally not required.

When is it a good idea?

  • when you don’t have a CS degree
  • when you’re applying out of your depth (UI developer applying to be DBA?)
  • when you really want the job and can speak intelligently about the company

When is it a bad idea?

  • when you copy paste the same cover letter every time
  • when you have nothing to say (i.e. you’re summarizing your resume)


Active on the job market? You should have one of three things on your resume

  • a great school (Stanford, MIT, Waterloo, UIUC, Carnegie Mellon, etc)
  • a great company (Google, Amazon, Apple, Facebook, or a known startup)
  • a Github account with a bunch of stuff in it

Learn to Complain

There’s a very undervalued skill I’ve observed over the years writing software, and that is complaining (to be valuable, you do have to be good at it). I’ve observed it in Amazon’s trouble ticket system, in open source communities, in questions on Stack Overflow, and pretty much anywhere else engineers interact collaboratively. Learning to complain accurately and precisely is an incredibly valuable skill and is a hallmark of great software engineers.

Creating and working tickets at Amazon was great experience for developing this skillset, and one I greatly under-appreciated at the time.  At Amazon, every team has a ticket queue (or 5), and every problem can be assigned to a ticket queue.  Tickets can move between queues on a whim, but tickets must always remain in a queue and every queue has an owner — thus, every ticket has an owner.  Additionally, there’s pressure from management to have fewer tickets in your queue at all times, and to have fewer ‘high severity’ tickets pass through your queue.  The result of this environment is that when you open a ticket against another team, you want to be careful it doesn’t come back to your own queue (more on how this works below).  Your software quality is measured indirectly through how many tickets you receive.  (It’s more complicated than that, but the way I’ve seen it used it is a valid proxy.)

Let’s take an example. I’m working on some software, and I find what I believe to be a defect in someone else’s code (Gasp!). I might open a ticket against the owner of that software to request a fix. Should they determine that their systems are working as designed (read as “you’re the one who made the mistake, check your own code”), they might reassign the ticket back to my queue. After all, they have determined they have no action to take to resolve the supposed issue. Yup, now both our metrics show that ticket passing through, but I own the next step. This pattern results in at least a moral incentive not to have tickets you create come back to your own queue (from firsthand knowledge, it is rather embarrassing when it happens). The only way to get them to ‘accept’ the defect report is to demonstrate or provide evidence of the error. The result is that you learn to write great tickets. This same skill applies to (some) questions on StackOverflow or issues on GitHub. Being able to precisely and accurately diagnose the issue by identifying expected behavior and contrasting it with the observed behavior takes time and diligence. Furthermore, it requires you to eliminate possible sources of error in code or systems you control before concluding the error lies outside of your domain.

Great complaints usually include enough detail to find and trace the issue in question (request id for log searching along with approximate time and date of request), logs clearly demonstrating the symptom in question, and/or code that can deterministically reproduce the issue.  (If it’s non-deterministic, it’s an order of magnitude harder to diagnose … good luck).  

Once you’ve gone through this exercise a few times (and had other people politely reject your complaints due to lack of information), you start to get pretty good at creating tickets.  Also, the exercise is valuable in itself, and often leads to a resolution of the issue before you can complain.  On numerous occasions I’ve started asking a question on StackOverflow or on Github and in the course of asking the question and creating reproduction steps I have solved my own problem.  I will see a behavior that appears to be an error in someone else’s code; I will start working on an issue; go looking for an example, some code, or logs to make my point, and discover the issue lies elsewhere.  I’d say this happens 80% of the time I start a question or issue.

In my interactions with other engineers, it’s become clear that the ability to speak precisely about what is happening in and around code goes hand in hand with the ability to complain clearly. It’s hard to establish causality, but complaint quality appears to be highly correlated with engineering talent. Not sure if I could actually screen for it, but it’s an interesting question.

On a final note, while on one of my previous teams, we used to joke that we would get a large stuffed bear.  Before bothering another engineer for help, you would have to get the stuffed bear and explain your issue to the bear.  If you were unable to demonstrate your issue to the bear, you couldn’t bother another engineer.  This would simply force the thought process of stepping back from an issue to a higher level, explaining the inputs and outputs, and tracing the code through holistically — things like saying out loud what different variables contained.  90% of the time, you’d find your issue was a misassigned variable, or an instance of a class you were not expecting, or some other ‘menial’ error.  

I swear, if we had actually gotten the bear, he would have immediately become the most helpful member of the team and surely improved our ability to troubleshoot issues.

Note: Just so there’s no confusion, I’m discussing one of many metrics used by Amazon internally, and even this one isn’t applied as bluntly as I might have described.  The abstraction works for my use case, but please don’t jump to conclusions about Amazon’s management practices based on this.  Also worth noting, it’s been more than a year since I last worked at Amazon, so it’s possible the anecdote is no longer accurate.   

Waterloo fails a lot of students, and that’s okay

Part 1: Waterloo fails a lot of students majoring in Computer Science.

A lot!

When I was attending, you needed to maintain a 65% major average, and every class kept its average under 70%. In my first year, I was enrolled in Software Engineering (the major), which was basically composed of the top people from Computer Science and Computer Engineering in one class. We had a midterm once with an average around 78%, and were told it was too high. I’ve also heard profs say that 72% is too high for an average. Averages under 60% are the only ones I’ve ever heard of as being too low. From what I gathered as an undergrad, anywhere between 60-70% was okay, and profs tended to shoot for ~67%.

That’s an average only slightly higher than the required major average. The marks were not normally distributed, so the median was actually lower. There is a group of exceptionally bright people at Waterloo who create a small hump somewhere up in the 80-90% range, and then the larger hump is right around 60%. This is certainly generalizing, but I believe it to be close to the truth for most classes. (A 50% is a pass. It’s an ‘honors program’, so if you graduate, you get honors; you can’t graduate with CS and not get honors at Waterloo. In theory, this makes a 65% equivalent to an 80% at a lot of other schools.)

Every class I took was essentially graded on a curve. If the midterm was easy, you’d get slaughtered on the final. If everyone failed the midterm, the final would be not so bad. I had many profs resort to what I consider underhanded techniques to lower marks: assignments without marking ledgers, exams where 25% of the marks came from 1/2 of 1 lecture, etc.

Failing about 5% of the class every term results in survival of the fittest. Waterloo CS grads have a good reputation partially because they’re the top half of their class by definition of being a graduate. In retrospect, this actually makes a Waterloo CS degree that much more valuable, and improves the reputation of graduates that much more. The only downside is that Waterloo’s incredible bias towards lower marks (especially when compared to our neighbors in the US) makes it that much harder to go into grad work. Your 65% at Waterloo might be equivalent to an 80% at most US schools, but no one knows it and so your applications will just get filtered out of consideration.

Part 2:  Am I right?

So, here we are. I’m years out of school, and I’ve had this intuitive belief that Waterloo CS fails a lot of people, certainly way more than their advertised 12% dropout rate. I’ve always believed the number to be closer to 40%.

So I went and found the number of students enrolled as FTE in computer science:
And the number of degrees awarded in computer science:

In the table below, student count is # of full time students in the start year.  By the FTE definition, this should capture every first year student in CS.  The Degrees Awarded is in the same major (CS) from 5 years forward.  So the 2005/2006 start year has the degrees awarded in 2010, which is the latest complete data available.  

Start Year   Student Count   Degrees Awarded   Percentage
2005/06      280.8           222               79.06%
2004/05      416.8           215               51.58%
2003/04      548.2           339               61.84%
2002/03      667.5           354               53.03%
2001/02      604             448               74.17%
2000/01      668.5           514               76.89%
1999/00      618.5           462               74.70%
1998/99      548.5           421               76.75%
1997/98      641             387               60.37%
1996/97      513             310               60.43%

The average CS graduation rate over 10 years was 66.88%.  This is almost exactly failing 5% of the class every term, or failing 10% of the class every year.  Which works out to roughly what I believed intuitively.  
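
The arithmetic behind that comparison can be checked with a short Python sketch. The cohort figures are copied from the table above; the choice of 8 terms (two per year across 4 years’ worth of classes) is my assumption for illustration, as are the variable names.

```python
# (start year, first-year FTE student count, degrees awarded 5 years later)
cohorts = [
    ("2005/06", 280.8, 222),
    ("2004/05", 416.8, 215),
    ("2003/04", 548.2, 339),
    ("2002/03", 667.5, 354),
    ("2001/02", 604.0, 448),
    ("2000/01", 668.5, 514),
    ("1999/00", 618.5, 462),
    ("1998/99", 548.5, 421),
    ("1997/98", 641.0, 387),
    ("1996/97", 513.0, 310),
]

# Per-cohort graduation rate, then the unweighted average across cohorts.
rates = [degrees / students for _, students, degrees in cohorts]
average_rate = sum(rates) / len(rates)

# Compound survival if a fixed share of the remaining class fails each period.
# Assumption: 8 academic terms, or equivalently 4 academic years.
survive_5pct_per_term = 0.95 ** 8
survive_10pct_per_year = 0.90 ** 4

print(f"average graduation rate: {average_rate:.2%}")
print(f"survive 5% loss per term for 8 terms: {survive_5pct_per_term:.2%}")
print(f"survive 10% loss per year for 4 years: {survive_10pct_per_year:.2%}")
```

Under those assumptions the two compounding models leave roughly 66% of the starting class, which is why the observed average lines up so closely with “5% per term”.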

Some notes on the method:  

I tried looking through year-by-year to see if I could get a handle on the annual dropout rate, but it wasn’t really possible because the definition of a full time student is not compatible with the co-op program (in which >80% of CS majors participate), and because the program has 4 years’ worth of classes but is a 5 year degree. So we just look at starts and degrees. Also, regular students (not co-op) graduate 1 year sooner than their co-op counterparts, and I also want to capture (without penalty) those graduating late. For these reasons, I wouldn’t interpret 2004 to have been the hardest year to start and 2005 the easiest, but just that there was probably some skew in numbers here between co-op and regular (though they got much more selective in 2005, which could also explain the graduation rate).

It’s worth pointing out that in 2001, the Software Engineering program started, which potentially drew students to it who would otherwise have been in the CS program.  

Around 2005, Waterloo started offering a Bachelor of Computer Science to which it was easy to transfer from the existing CS program.  It’s unclear to me how to account for the difference between the new CS degree and the old BMath/CS.  The interface for the data did not make it clear to me how to distinguish them.  That said, the Software Engineering program took in 103 students in 2004, and then awarded 52 degrees (50% success rate).  I suspect transfers from that program may have goosed the 2005 graduation rate numbers.  In theory, you fail out of SE, and finish a CS degree 1-2 terms behind your start year.  I don’t have the data to substantiate this theory.

Since students transfer from Software Engineering into CS, but CS students (generally) cannot transfer the other way, it’s possible I’ve overestimated the graduation rates (at least as they apply to your chance of graduating, given that you’re in first year). On the other hand, if Waterloo is accounting for the BCS separately, my numbers could be off. Maybe the people who started in ’02 and ’03 switched to the BCS and earned that degree? But then, how do I account for the difference between BMath/CS and BCS undergrads? Did the BCS take over the ‘CS’ tag starting in ’05? Are the BMath/CS undergrads now lumped in with the BMath kids? And finally, I’m not sure how students failing out during first year impact the numbers. They definitely existed, but I’m getting my first year counts based on the number of students who took 2 full terms in their first year.

Since these questions are still floating around, it’s probably best to take my result with a grain of salt. However, given that (AFAIK) there was a lot more program stability prior to 2001, and assuming that Waterloo hasn’t fundamentally changed its policies on grades and pass rates, it would seem that my conclusion is not significantly far from reality.

And reality is that it’s damn hard to get a degree from Waterloo with the words Computer or Software printed on it, and that’s okay.


The Math class I never had

As I get deeper into AI, I realize that there was a math class I never had which would have been incredibly valuable as an undergrad. This class should have been day 1 of my undergrad, perhaps repeated every year, and certainly addressed for 20-30 min of each math course I took. That class would simply be a survey of mathematics, or maybe just, math from 10,000 feet.

As I dig deeper into unsupervised learning approaches, the solutions combine statistics with calculus and also require algebra to resolve. The intuition for many of the solutions comes from graph theory and geometry. Practical approaches require computational mathematics, statistics again (for estimations and acceptable error), and signal processing. Finally, truly practical applications often require computational resources and data stores that are enabled by (to be grossly general) computer science, while delivering them on schedule and in a maintainable state is software engineering.

I disliked most of my math classes through college. Many of my professors were very poor teachers, which was compounded by their poor command of the English language. The courses were poorly organized, exams poorly written, and notoriously hard to study for. As an example, my Stats 231 (stats 2 for math majors) final exam was worth up to 80% of my mark, but marked out of just 38. It was written by another prof, and required the cumulative learnings from stats 1 and stats 2, but was advertised as not retesting material from stats 1. A very unpleasant experience at the time, but quite successful at failing at least 10% of the class, which is about right. (For reference, I started with somewhere in the neighborhood of 1000 students in my major, and somewhere around 600 graduated.)

I feel that keeping the bigger picture in view (applications of the math we were learning, or maybe starting with a 1-day intro to AI or other advanced topics and just pointing out the math needed to go into them) would have been immensely valuable.