Category Archives: Programming and Software
Least Squares Linear Regression with Python
Some Python examples from UCSD stats 650, Fall 2014
Facebook isn’t really your friend
The Social Graph is Neither (Pinboard Blog).
You might almost think that the whole scheme had been cooked up by a bunch of hyperintelligent but hopelessly socially naive people, and you would not be wrong. Asking computer nerds to design social software is a little bit like hiring a Mormon bartender. Our industry abounds in people for whom social interaction has always been more of a puzzle to be reverse-engineered than a good time to be had, and the result is these vaguely Martian protocols.
The Corporate IT Hierarchy
- Database Admins
- Project Managers
- Software Engineers
- Business Analysts
- Configuration Management
- Network admins
- Quality Assurance
Entity Framework: the OMG ORM?
ORM (Object-Relational Mapping) technology has been around a while now, and has moved from niche to mainstream along with code generation, unit testing, and agile development. ORM, in a nutshell, allows a developer to link object models to relational database schemas, bridging the impedance mismatch inherent between relational databases and object-oriented programming languages. ORM can effectively eliminate the tedium and clumsiness of writing low-value SQL queries and can provide some pretty significant improvements in the deployment process. I don’t want to extol the virtues of ORM too much here; this article assumes you are familiar with the concepts. Popular tools like Ruby on Rails’ ActiveRecord and Java’s Hibernate are widely used implementations that can give you more background on the technology as well.
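For anyone who wants the idea in miniature, here is a rough sketch of what a mapper does by hand. It is written in Python against an in-memory SQLite database purely for illustration; the `Customer` class, the `CustomerMapper` class, and the `customer` table are all hypothetical names, and a real ORM generates or configures this plumbing rather than making you type it.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical entity; its fields line up with the table's columns.
    id: int
    name: str
    city: str


class CustomerMapper:
    """Toy mapper: translates between Customer objects and table rows."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
        )

    def save(self, customer):
        # Object -> row: insert, or overwrite the row with the same id.
        self.conn.execute(
            "INSERT OR REPLACE INTO customer (id, name, city) VALUES (?, ?, ?)",
            (customer.id, customer.name, customer.city),
        )

    def get(self, customer_id):
        # Row -> object: returns None when no row matches.
        row = self.conn.execute(
            "SELECT id, name, city FROM customer WHERE id = ?",
            (customer_id,),
        ).fetchone()
        return Customer(*row) if row else None


conn = sqlite3.connect(":memory:")
mapper = CustomerMapper(conn)
mapper.save(Customer(1, "Ada", "San Diego"))
found = mapper.get(1)
missing = mapper.get(99)
```

The calling code only ever sees `Customer` objects; the SQL lives in one place. That separation is the part an ORM automates.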
Microsoft has recently released Entity Framework 1.0 (EF) in Visual Studio 2008 SP1, its first, albeit very late, entry to the ORM world. I have several years of experience with the open-source NHibernate, a .NET port of Hibernate and the standard ORM technology for the .NET platform. While not perfect, NHibernate has been an essential workhorse in my toolkit, and one that I have leveraged to great success in both simple and complicated projects. I decided to take a look into EF to see how it compared to NHibernate.
The biggest and most obvious difference is that EF has a very slick UI built right into Visual Studio.NET. This UI allows you to use a wizard-style interface to select database objects and generate the entity model with a few clicks. Changes to friendly names and collections are easily done with a simple click and rename. Foreign key relationships are automatically translated into object collections with no coding necessary. Under the hood this is all code generation – in fact an incredibly complicated set of C# classes that cannot and must not be modified by hand.
NHibernate, by contrast, relies on the much-maligned XML configuration file for its mappings. These files are well-known for being antagonistic to new users, but with some experience they are reasonably easy to get down. Visual mapping tools and codegens do exist (I have used a customized MyGeneration template for years), although they are not nearly as polished as the EF designer in Visual Studio 2008.
Once I had taken a spin through the EF designer and generated some classes, it was time to perform some basic CRUD operations against my database. EF does not provide pre-built CRUD code; instead it relies on Microsoft’s new LINQ-to-Entities syntax to allow a C#-native, SQL-style syntax for querying the entity model. Writing code to get an object by its ID via LINQ is very simple, and writing it in C# with full Intellisense support and syntax highlighting is a nice feature.
NHibernate has a query language too: HQL, a pseudo-SQL language that is not native to C# but serves essentially the same purpose as LINQ. For much day-to-day work there is not much of a difference in the query language aspect between these two ORMs, but LINQ definitely has the advantage here. The recently released LINQ-to-NHibernate provider should close this gap considerably.
Entity Framework does a good job of hiding the database, perhaps too good a job. While NHibernate just needs a connection string to access a database, EF uses a set of special files to store the model and its mappings. Managing these files and their connection strings can be troublesome when accessing the model from separate projects (e.g. a unit test project and a web app). Also, NHibernate exposes the database connection a bit more explicitly through a Session object, offering numerous ways to control the session, as well as some fairly deep features such as lazy loading, caching, and various collection types which are not present in EF.
The main challenge with EF becomes apparent when you attempt to build an object model beyond the basic object-per-table paradigm the EF GUI exposes. If you want to create a rich object model with modern OO techniques of aggregation, composition, generalization and specialization, it will be very hard to do with the EF designer. Dropping out of the designer presents a level of complexity much deeper than NHibernate’s config files.
EF’s preference for table-per-entity design creates a particular issue with legacy databases, especially databases that don’t expose good key structures (PeopleSoft is a particular example), or multiple database sources. This limitation, in my opinion, is the biggest problem with EF. Most modern software consumes multiple data sources – XML files, databases, web services, file system resources. It is usually beneficial to wrap these resources into a business entity model and not expose the underlying storage. Since NHibernate is basically just a mapping, it is less intrusive into your domain objects, and is less dependent on a ‘database-first’ approach to building out the entity model.
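To make the "wrap the resources, hide the storage" point concrete, here is a small sketch, again in Python and again with entirely made-up names (`Employee`, `SqlEmployeeSource`, `XmlEmployeeSource`, `EmployeeDirectory`, and the `staff` table are assumptions for illustration, not anything from EF or NHibernate). One source reads a database table, the other reads an XML document, and the caller sees only business entities.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    # The business entity: no trace of where the data was stored.
    employee_id: str
    name: str


class SqlEmployeeSource:
    """Reads employees from a (hypothetical) legacy 'staff' table."""

    def __init__(self, conn):
        self.conn = conn

    def all(self):
        rows = self.conn.execute(
            "SELECT emp_no, full_name FROM staff ORDER BY emp_no"
        )
        return [Employee(str(emp_no), full_name) for emp_no, full_name in rows]


class XmlEmployeeSource:
    """Reads employees from an XML document with <employee id=...> nodes."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)

    def all(self):
        return [
            Employee(node.get("id"), node.findtext("name"))
            for node in self.root.findall("employee")
        ]


class EmployeeDirectory:
    """Single entry point that merges every configured source."""

    def __init__(self, *sources):
        self.sources = sources

    def all(self):
        employees = []
        for source in self.sources:
            employees.extend(source.all())
        return employees


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (emp_no INTEGER, full_name TEXT)")
conn.execute("INSERT INTO staff VALUES (1, 'Ada')")
xml_text = "<employees><employee id='2'><name>Grace</name></employee></employees>"

directory = EmployeeDirectory(SqlEmployeeSource(conn), XmlEmployeeSource(xml_text))
everyone = directory.all()
```

Nothing about `Employee` depends on a table layout, which is the property a database-first tool makes harder to keep.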
Ultimately, it would appear that EF, while a good first step, is probably not going to unseat NHibernate as the ORM of choice in the .NET framework. While it is an acceptable solution for most basic, CRUD-heavy situations (and there are lots of them), it is probably not going to be useful to the people who most often use ORM – enterprise developers working across varying data sources looking to simplify their data access into a comprehensible model. It will take time, and potentially several versions, for EF to get to where it needs to be. NHibernate has the inertia and the large pool of experienced users that are crucial for these types of tools.
However, there is no question EF has incredible potential. Questions persist about NHibernate’s future as it has stayed at v2 for many years. Stability is not a bad thing in my opinion, but Microsoft will catch up fast as they have the resources to do so. Also interesting is Microsoft’s apparent PR blunder around Entity Framework. The announcement that EF would be the data access technology of choice going forward created a backlash around the simpler and more mature LINQ-to-SQL, which had already been in the wild for some time and was being used by many developers. Ultimately we will have to wait until 2010 and .NET 4.0 to see if EF is truly going to be the tool that will wean so many developers off of sprocs and ADO.NET code.
Work has been proceeding in order to bring perfection to the crudely conceived idea of a machine that would not only supply inverse reactive current for use in unilateral phase detractors, but would also be capable of automatically synchronizing cardinal grammeters. Such a machine is the ‘Turbo-Encabulator’.
The original machine had a base-plate of prefabulated amulite, surmounted by a malleable logarithmic casing in such a way that the two spurving bearings were in a direct line with the pentametric fan. … The main winding was of the normal lotus-o-delta type placed in panendermic semi-boloid slots in the stator, every seventh conductor being connected by a nonreversible trem’e pipe to the differential girdlespring on the ‘up’ end of the grammeters.
Forty-one manestically spaced grouting brushes were arranged to feed into the rotor slipstream a mixture of high S-value phenylhydrobenzamine and 5% reminative tetryliodohexamine. Both of these liquids have specific pericosities given by P = 2.5C.n^6-7 where n is the diathetical evolute of retrograde temperature phase disposition and C is Cholmondeley’s annular grillage coefficient. Initially, n was measured with the aid of a metapolar refractive pilfrometer … but up to the present date nothing has been found to equal the transcendental hopper dadoscope.
Undoubtedly, the turbo-encabulator has now reached a very high level of technical development. It has been successfully used for operating nofer trunnions. In addition, whenever a barescent skor motion is required, it may be employed in conjunction with a drawn reciprocating dingle arm to reduce sinusoidal depleneration.
My short, strange affair with Yelp.com
A few months back I encountered the hot, hip, new web site Yelp. For those who don’t know, Yelp is a truly genius idea – a review site for restaurants and businesses that allows anyone to post a review. With just a quick glance I could tell that Yelp was better than longtime competitors like epinions or AOL CitySearch. I dove right in. I read some reviews, even checked out some new restaurants based on the recommendations I found on Yelp. And in time, I registered an account and wrote my own reviews.
It was in those first few days of review writing that I noticed things were a little strange around there. I wrote about a hundred reviews – mostly about a paragraph long – in just a few days. This caught the attention of the established Yelp community, who started a thread about me in the forums characterizing me as a “speed freak”, among other things. I shrugged it off without much thought – it’s a review site, and I wrote reviews, what could be wrong with that?
As time rolled on, I began to venture into the Yelp user forums. My first post was asking for opinions on a tailor in San Diego – the type of tailor that I could trust with an expensive suit, lamenting the typical three-dollar hack-n-slash shops that could be found around San Diego. The responses surprised me. I was basically told that “no one in San Diego wears expensive suits”. Still undeterred, I engaged in more conversations, about restaurants, San Diego, and other topics of general interest.
But after a while, things started to go south. My wife visited a day spa ranked as “pricey” and “upscale” by the Yelp community. It turned out to be a typical Vietnamese clip-n’-buff joint, replete with the nasty “towel full of toe-nail clippings” she so desperately tried to avoid. A few recommendations went bad too. An Italian place was just OK, but had dozens of raves on Yelp. And some of my favorite places were getting terrible reviews. Was it me?
Still undeterred, I continued to read, lurk, and contribute to the site. But soon it would dawn on me: Yelp sucks.
The problem with Yelp is not the site itself. In fact, the site is great. It makes it easy to find new businesses and read opinions about them. It’s well laid out with good maps, and easily googled too. No, the problem is not the site: the problem is the Yelpers.
The reality is that 90% of Yelp users are young. The Yelp demographic breaks down to mostly college kids and twenty-somethings. There are a few weird thirtysomethings thrown in as well. Now there is nothing wrong with young people. However, most young people share a few common situations which can affect their rating and ranking of a business. The common thread, I realized, was that most Yelpers are broke. So anything cheap is great to them. And anything free… especially things given to them as part of their Yelp Elite status… is treasured.
And let’s face it, how many fine dining experiences can a twenty-five year old really ever have had? Aside from a few graduation dinners, maybe two or three. So does it really make sense for me to base a decision on a restaurant on the guidance of a person who’s only eaten at a nice restaurant with his parents? And while I realize 100 bucks a plate is not cheap, 30 bucks a plate is not expensive either, which appears to be about the threshold for most Yelpers.
In the end I cancelled my account and deleted my reviews on that site after realizing I was hopelessly out-of-place. My reviews, while useful to someone out there, were dwarfed by the hundreds of reviews by cheeky sorority girls, graphic artists, and internet marketers.
Web 2.0 Expo musings
This week I attended the Web 2.0 Expo at the Moscone Center in San Francisco. It was an interesting scene both in the content presented and the attendees themselves. Like any event of this size, the speakers, panels, and sessions are hit-and-miss, in my case with duds-vs-great about 50/50. Some observations in no particular order:
- Yahoo seems to be pushing forward with a very big and consumer-focused product line. Google is pushing the infrastructure and apps. I don’t think Google and Yahoo are really as competitive as it may seem.
- Google’s basically got a thing where if your app runs on Python, you can run it on their systems. That’s just wild.
- Yahoo threw a great party with free kegs, and thanks to all the little startups whose free beer I enjoyed.
- Inarticulate 20-somethings may be getting millions handed to them, but they are impossible to listen to for 23 minutes.
- The company that created FunWall and SuperPoke closed a fifty freaking million dollar round of funding.
- Microsoft was treated like a now-dethroned bully who got his comeuppance, with any Microsoft-related joke or jab getting laughs. iPods, iPhones, and MacBooks were de rigueur. Woe for the MSFT product pitchman who has to face this group.
- MySpace was treated like a wildly-successful-but-immature younger brother: with equal parts envy and contempt.
- Clay Shirky was amazing, as was the guy who does the Fake Steve Jobs blog.
- Tim O’Reilly and Jonathan Schwartz were not as interesting as I thought they’d be.
- Facebook and Twitter, et al. may be toys, but when you look at the numbers – the users, the downloads, the “tweets”, it’s pretty staggering.
- 90% of the businesses represented make their money on ads, or selling services to companies that do.
- The general trend is away from the ‘web site’ as being a basic and uniquely identifiable entity. It seems that without a social and participatory element, a lot of projects aren’t worth doing right now.
- Technically the cutting edge seems to be pretty stable – the focus has shifted away from language and platform and towards processes and methodologies.
- If you are serious about a web startup and getting funding you still do need to be in the Bay Area.
- If you are serious about your career as a web developer you still do need to be in the Bay Area.
- I am not sure why but I found the levels of iPhone, BlackBerry and laptop usage to be a little annoying. You paid to do this thing in person, why not actually be there?
- There were a lot of Germans and a lot of hipsters.
- There were a lot of people who were younger than me working on cooler stuff.
- San Francisco is a great city to visit, but is kinda sketchy sometimes, in a way that SD isn’t.
- SF appears to be overrun with hipsters. It is a little bit of a mini NYC scene over there.
- 8 hours of conference and then 6 hours of dinner and drinking is not sustainable for more than a couple days.
Words of Wisdom
Consulting: If you’re not part of the solution, there’s good money to be made in prolonging the problem.
The Pedantic Programmer
In my time working with and managing software developers, I have noticed some interesting personality traits which have the uncanny ability to make one completely nuts. Engineers in general have a reputation for being socially difficult; stereotypes range from the bearded stinky overweight guy to the geeky skinny guy with too-short pants and taped glasses. Reality is somewhere in-between, of course.
The Pedantic Programmer is the guy who refuses to think, unless it is to solve an immediate technical problem. He refuses to consider the needs of the customer, business, or any other human being. As far as he is concerned, the only solution to a problem is the ‘technically correct’ solution to the problem, regardless of the fact that humans are quite often not ‘technically correct’. The Pedantic Programmer often has a lot of experience, and often prefers a ‘back-office’-type environment where he is well-insulated from actual customers and users.
A Pedantic Programmer will never consult reality for guidance in his problem solving. He will always defer to the ‘architecture’ of the system. The limitations of the system will always trump the need for the system to serve the user who pays for it. The Pedantic Programmer doesn’t like fuzzy ideas. He doesn’t like incongruous business rules. He wants everything nice, tidy, and predictable, just like a computer. Common sense is his enemy – since it can’t be quantified.
The Pedantic Programmer’s best skill is Language Lawyering. He’s especially good at taking quasi-meaningful terms like “Profile” or “Master” and flipping them around at will, especially when he’s justifying a drop-down list of thousands of duplicate records, or explaining why he can’t change the size of that field because he’d have to ‘radically change the architecture’.
Some systems, especially those inherited from others, can make simple, trivial changes very difficult to implement in a clean way. While we all want to improve a system, sometimes it just doesn’t make sense to spend the money necessary to improve the system to solve a trivial problem. Sometimes, you have to eat a hack to live to fight another day. Most developers do understand this, but to the Pedantic Programmer, this is an opportunity to re-factor, re-engineer, and give a 5-day estimate to display “WA” instead of “Washington”. While admirable in his zeal to solve a problem, if we do this enough times, our customer will disappear.
Pedantic Programmers always want a spec. If you box them into a corner, they will always play the ‘spec card’, which is some combination of ‘give me a spec’ or ‘you never gave me a spec’. This is programmer-speak for “Covering Your Ass”. Should I give a senior software engineer with 10 years of experience a spec for “Show WA instead of Washington”? This makes managing the Pedantic Programmer very, very tough.
So what does a young professional do when confronted with the challenges of managing a Pedantic Programmer? You can write very detailed specs, but they will always end up wrong, outdated, or unread. You can micro-manage, but Pedantic Programmers don’t like social interaction and no one really likes micro-managing. You can argue with them, but they are often quite sensitive and tend to be easily disgruntled. You can cave in and allow the guy to spend 10 days solving things ‘the right way’, but your boss, the PM, and the CFO won’t like that idea much either.
In the end, you might just be screwed, especially if the Pedantic Programmer is the only guy who really understands some particularly wicked bit of the system. If you’re smart (or lucky) you’ve read Eric Sink’s Developers not Programmers and avoided the Hazards of Hiring and haven’t saddled yourself with a Pedantic Programmer. If you’re not that smart, well, there’s always booze.