Python vs C#

I thought I’d post about something different for a change, just to prove I have interests outside of gardening. And since I was debating this topic with a colleague on Friday, it seemed like a good place to start.

Static vs Dynamic Typing

I should explain that I work in the digital department of a medium-sized engineering company. My department is divided between people who are more focused on developing algorithms to solve interesting problems (‘modellers’) and people whose focus is building production IT systems (‘software engineers’). Obviously these two sides tend to favour different languages, hence the Python vs C# debate.

The debate at the time focused on whether static typing is a major plus in choosing a language or not. I personally don’t think that static types should be the deciding factor in choosing a language – not because I find it hard to use a statically typed language, but because my experience is that static types typically only really catch the easy errors early.

Of course, it’s useful to be immediately told if you’re trying to add a string to a number. But generally such errors are identified even without static checking, as long as your testing is thorough enough to cover your code base properly. Where static types don’t help is with the hard-to-find errors, where you’re doing the algorithmically wrong thing with the right types, and those are the ones where you lose a lot of time. The only way to find those is to do thorough testing, which tends to catch your type errors in any case.
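
To illustrate (with a deliberately contrived function of my own, not something from the actual debate): the code below is perfectly well typed, numbers in and a number out, but the algorithm is wrong, and only a test will tell you so.

def average(values):
    # Type-wise this is fine: a list of numbers in, a number out.
    # Algorithmically it is wrong: it should divide by len(values), not by 2.
    return sum(values) / 2

# No static checker will object, but even the most basic test catches it:
assert average([2, 4, 6]) == 4  # raises AssertionError, exposing the bug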

Of course, a better type system can make static typing slightly more useful, by making the types of functions more descriptive of what they should do and by ruling out more incorrect behaviour at compile time.

For example, in the past I’ve written code outside of work in Haskell, which has a very strict but powerful type system. In Haskell, the type of a function to get the length of a list is:

length :: [a] -> Int

Here, ‘a’ represents any type at all. The function can take a list of values of any type and return a number. The type signature also encodes that length does no IO or anything else non-deterministic.

Given that we know ‘length’ is a deterministic function that does no IO, that it knows nothing about the type of the values contained in the list it takes as input, and that it returns a number, there’s a very limited number of things it could do, and a lot of things the compiler can object to up front.
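
For comparison, the closest Python gets is an optional type hint (a rough sketch using the standard typing module): the hint captures the genericity, but unlike the Haskell signature it says nothing about purity or IO.

from typing import Sequence, TypeVar

T = TypeVar("T")

def length(xs: Sequence[T]) -> int:
    # A checker like mypy can verify the types here, but it cannot rule out
    # side effects hiding in the body the way the Haskell signature does.
    return sum(1 for _ in xs)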

Broadening the Argument

Of course, my debating partner didn’t agree with my argument. But thinking about it afterwards, that was the tip of the iceberg. Here’s some other criteria where I’d disagree with common practice:

  1. Benefits of OO (Object Orientation)
  2. Reference vs value programming

I think from our discussion that my colleague would disagree with me on OO, but agree with me on the importance of restricting shared references.

Object Orientation

Object orientation is probably the most popular programming paradigm around right now. And for good reason, since it directly encodes two human tendencies / common ways of thinking:

  1. Organising things into hierarchies
  2. Ascribing processes to entities

Now, I don’t deny that OO can be a helpful way to structure your thinking for some problems, although helpful isn’t the same as necessary. What I have a bigger problem with is what you might call the OO fundamentalism that’s taken over much of the field.

Depending on the language, you might find things like:

  • The inability for a function to just be a function without an owner

Do Cos and Sin really need to belong to a class rather than just a library of functions? Do you really need to use a Visitor pattern instead of just passing a first class function to another function? Does a commutative function like add really ‘belong’ to either of the things being added?
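
As a trivial sketch of that last point (the functions here are made up purely for illustration), Python is quite happy for you to hand one function to another, with no Visitor class in sight:

def apply_twice(func, value):
    # 'func' is just a first-class function passed in as an argument.
    return func(func(value))

def double(x):
    return x * 2

print(apply_twice(double, 5))  # prints 20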

  • The inability to separate shared interfaces from inheritance

In some languages, interfaces as a concept don’t exist. In others, they do exist but are under-utilised in the standard library, meaning that in practice you’re often forced to build a subclass when all you really want is to implement a specific interface required by the function you want to call.

In Python, this issue is solved by duck typing. In at least one non-OO statically typed language (Haskell), it’s solved by type classes. In C# it’s inconsistently solved by interfaces.
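
A minimal sketch of what duck typing means in practice (the shape classes are invented for illustration): total_area below works with anything that happens to have an area() method, with no shared base class or declared interface, and only testing tells you whether a given object actually complies.

class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2

class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

def total_area(shapes):
    # No base class and no interface declaration: anything with an area() method works.
    return sum(shape.area() for shape in shapes)

print(total_area([Circle(1), Square(2)]))  # total of the two areas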

  • WORSE: interfaces only by sub-classing, and single inheritance so classes can effectively only implement one ‘interface’ at a time

Who decided to make it so hard to specify the actual interfacing standards in a generic way?

  • The tendency for OO languages to encourage in-place mutation of values and reliance on identity rather than value in computation as the default, rather than as a limited performance enhancing measure

It’s now widely agreed that too much global state is a bad thing in programming. But the badness of global state is really just an extreme of the badness of directly mutating the same memory from many different locations. The more you do this without clear controls, the harder it is to debug the resulting program. And yet the most common programming paradigm around encourages, in almost all cases, in-place mutation and reference sharing as the default.

This annoys me so much I’m going to write a small section about it.

References vs Values

Let’s illustrate the problem with a simple Python example, shall we? In Python you can multiply a list by a number to get multiple copies of the same values, for example:

[1] * 3 = [1,1,1]

Now, let’s imagine we want a list of 3 lists:

[[1]] * 3 = [[1], [1], [1]]

Let’s say we take our list of lists and add something to the first list.

x = [[1]] * 3

x[0].append(2)

What do you think the value of x is now? Do you think it’s [[1,2], [1], [1]]? If you do then you’re wrong. In fact it’s [[1,2], [1,2], [1,2]], because all elements of the list refer to exactly the same object.
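
If you actually want three independent inner lists, you have to ask for them explicitly, for example with a list comprehension:

x = [[1] for _ in range(3)]  # three distinct inner lists, not three references to one

x[0].append(2)

print(x)  # [[1, 2], [1], [1]]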

How dumb is this? And to make it even worse, like many OO languages Python largely hides the distinction that drives this behaviour (here, between mutable lists and immutable integers, which effectively behave like reference and value types), so the following does work as expected:

x = [1] * 3

x[0] += 1

You get the expected x = [2,1,1] as a result.

So you have a pervasive tendency for the language to promote object sharing and mutability, which together mean you have to be incredibly careful to explicitly copy things, otherwise you end up corrupting the data other parts of your program are using. And unlike in C, where for the most part it’s clear what’s a pointer and what’s not, you also have an unmarked lack of consistency between types which behave this way and types where operations are effectively by value.
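
In practice that means reaching for explicit copies (for example via the standard copy module) whenever mutable data crosses a boundary you care about:

import copy

original = [[1], [2]]

alias = original                       # shares the same underlying objects
shallow = list(original)               # new outer list, but the inner lists are still shared
independent = copy.deepcopy(original)  # a fully independent copy

original[0].append(99)

print(alias[0])        # [1, 99] - changed along with the original
print(shallow[0])      # [1, 99] - the inner list is still shared
print(independent[0])  # [1]     - unaffected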

Similar issues occur in most object-oriented languages, creating brittle programs with vast amounts of hidden shared state for no obvious benefit in most cases. It would be better to have special syntax for the limited cases where shared state is important for performance, but that’s not the way most of the world went. And now we’re paying the price, since the shared-reference model breaks down completely in a massively parallel world.

So – Python vs C#?

How do Python and C# stack up on all three criteria?

  • Typing

Both are strongly typed, but C# has static typing while Python doesn’t.

As I said, for me C#’s type system isn’t clever enough to catch most of the hard bugs, so I think it only earns a few correctness points while losing flexibility points.

No overall winner.

  • Object orientation

Both languages are mostly object-oriented. Python is less insistent on your own code being OO than C#, and will happily let you write procedural or semi-functional code, as long as you don’t mind the standard library being mostly composed of objects.

C# has interfaces as a separate concept, but for added inconsistency also uses sub-classes for shared interfaces. Python mostly does shared interfaces by duck typing, which is of course ultra-flexible but relies on thorough testing as the only way to check compliance with the required interface.

Since I don’t think OO is always the best way to structure a problem, I’d give the points to Python on this.

  • References vs values, aka hidden pointers galore

Both Python and C# have the same disturbing tendency to make you work hard to limit shared state, and promote bugs by choosing the dangerous option as the default.

Both languages lose here.

So in terms of a good experience writing code, I’m inclined to give Python the advantage, but to be honest there isn’t much in it. For me, other factors are much more important, such as availability of the functionality required for a project, availability of others in the team with the right skillset for ongoing maintenance, and very occasionally performance (Python is not fast, but this doesn’t matter for most projects).

Now if a good functional language like Haskell or OCaml would just become common enough to solve the library and personnel-availability issues, we’d at least have the opposite extreme as an option (value over reference, little or no OO). Then maybe in another decade we could find a compromise somewhere in the middle…

2 thoughts on “Python vs C#”

  1. I’ve read this post time and time again. I don’t know how to reply 🙂

    I’m not a fan of OO but I have tried a lot of different languages. I grew up programming in the ’80s when it was BASIC, Assembly, etc. on home PCs. I’m not a fan of the old BASIC with little or no structure. Outside of that I see a programming language as needing to fit the personality of the programmer and being suitable to the task in hand. I’ve been an electronic engineer, a sys admin for an ISP as well as being a hobbyist electronic designer. I’ve done commercial programming as well as owning an ISP and seeing internet security issues with customers. I’ve seen a lot of other programmers and obviously as a consumer I’ve seen a lot of electronic designs controlled by other people’s programs. All of this has shaped me.

    To be honest, I never get down to thinking about programming languages as you have, I get bogged down and very wound up with more obvious issues that I see.

    OO is, I would guess and suggest, suited to Modelling but beyond that I think a language should be tailored to the task at hand. When programming electronics I think a simple procedural language is best, using referencing and simple functions, like C with no OO. With programming a website the flow of the program should mimic the flow of the visitor through the site, again procedural. PHP is perfect for this.

    Writing programs for a WIMP environment, Visual Basic, Visual C, event driven languages seem ideal.

    From what I observe very, very few people program simply. It was always the same with computer networking: the overall design of the program or network gets unnecessarily complicated, and I think this shows up as bugs and security issues. If a programmer ends up with a program with memory leaks or security holes it suggests to me they can’t get their head around the issues, normally thrown up by the language itself. I think the wrong language for the programmer has been chosen. A program with bugs, or one that behaves unpredictably, is often a sign of commercial pressures – not enough testing done.

    When I was writing an app for a mobile device (Android), it became clear that the whole platform was doing its best to hide as much of the underlying working of the phone from me as possible. Too much was done for the programmer, meaning that the programmer wasn’t forced to understand the object he was programming. Normally in anything modern, like a mobile device or even programming on a Windows-based PC, you aren’t programming a computer, you are programming another program which then controls the hardware. I think this approach makes programmers who don’t know anything about the beast they are writing programs for.

    I see kids learning to program using Scratch rather than getting down and dirty with variables; they get shielded from types, binary, integers, floating point, etc. I don’t think this is good.

    I find it very difficult to get into the sort of conversation you are having because I get hung up with problems I see caused by poor programmers. The latest example which made my blood boil is a security camera. One of the biggest selling ones. A Tenvis camera, aimed at Joe Bloggs, for home use. It promises easy use and makes securing your home easy. The camera works from a phone app. It doesn’t matter what language they used to program in because it’s so bad that anything they used would have been wrong. The app doesn’t work or locks up. The camera sometimes records, sometimes not. It sometimes notifies me with an alarm, sometimes not. You have to connect a network cable to it to set it up, then use it on wifi, but if you decide to keep the cable in (because of dodgy wifi) all security and passwords are bypassed. It doesn’t enforce a password either. I can see my neighbour’s camera, back garden, front garden and even inside his living room and kitchen. I put passwords on mine only to find they only work on the video side; my neighbour could hear my voice even though he couldn’t see a picture without the password. There is no way to disable the microphone, only turn down the volume. The web interface doesn’t work with any browser because all modern browsers block the type of plugin the software uses… and this is one of the best-selling cameras. Pathetic programming, pathetic testing and loads of bugs. Designed for security, it actually makes security worse. Nothing good about it; it can’t even do what it says on the tin. Then when you use the app away from the house it streams and uses up all your data allowance even when you aren’t viewing the image. This is today’s programmers making today’s products.

    When I see things like this, the detailed and interesting topic you’ve started just seems pointless and I turn away 🙂 Nothing to do with you or your post, it’s just where I’m at with the modern world.

    End of rant 🙂 Keep up the different and interesting posts

    • I guess you were learning to program much earlier than I was, but the starting point was probably similar. The first language I ever learned was old-style BASIC on an Amstrad (I think!) word processor, complete with line numbers, and then QBasic and Visual Basic 3 on an IBM 386 (486?). Since then I’ve tried everything from C to MATLAB to Python to PHP to Haskell, either for work or for hobby projects. And in that era of cooperative multitasking (or no multitasking!) I accidentally found a thousand ways to lock up or crash the machine, which is the kind of experience you don’t get in Scratch, as you say.

      I’m not sure OO is even universally good for modelling. It’s good for a certain kind of simulation-type model, and I think the paradigm works quite well in certain restricted domains like GUI widget hierarchies and GUI construction. I think it can be a good approach sometimes, but then ‘if all you have is a hammer, everything looks like a nail’ kicks in. I definitely think there’s something to be said for procedural programming. We wrote a lot of models where there was a clear linear set of processing steps, and happily wrote them as a few thousand lines of basically procedural code.

      The issue of simplicity is another thing I don’t like about C#: I think the language is too baroque. It goes down the Java route of trying to make life ‘safe’, but the solution is to add a big architecture of similar, overlapping constructs to handle all the corner cases that the ‘safe’ system made difficult. I agree that it’s important to pick people who understand what they’re trying to do and then give them tools that support them rather than fighting them too much.

      One issue with understanding the hardware is that the hardware itself has become harder to understand. The CPU in your computer (at least if it’s x86) probably has a CISC -> RISC translation layer, followed by a complicated system of parallel instruction execution, branch prediction, speculative execution, etc. Even if you directly interact with the hardware, the hardware itself has become a bit like the software environment, but in miniature.

      The problem with your camera is probably that generic programmers were hired to write the software by some HR droid with no understanding of what they actually wanted to build. Writing stable, secure network applications is a discipline in its own right, and relying on standard libraries and frameworks only takes you so far. If you hire a load of people who’ve come out of a course or degree with a basic understanding of Java or whatever but no past experience in the problem domain, you’re going to come to grief. HR have been on a multi-decade quest to make “programmer” a standard job, but it just isn’t, in the same way that “artist” isn’t a standard job. If you want Rembrandt and you hire Picasso, you’re not going to be happy with the result.

      Having said all that, I think better tools can help, just not the ones people are producing. There’s been a lot of work on hiding pointers and memory management from people, and on adding types and complexity galore. And this solves some problems, but it creates a lot of other ones, because the added complexity means it’s easier to **** up when you have to fit so much in your head simultaneously.

      Java or C# are not really useable without a clever IDE. Visual Studio is extremely impressive, but it has to be because otherwise using C# would be painful. Writing a large C# program or library with only a text editor would not be an enjoyable experience.

      And at the same time these frameworks are ignoring one of the biggest problems there is, by encouraging shared references and making it hard to work with *values*. In C, it’s at least obvious when you’re working with shared references because you know what’s a pointer and what isn’t, but this distinction is completely hidden in most modern OO languages, which encourages the use of references/pointers without thinking about what might be broken in the process.
