Looking Back upon CSC148

The course CSC148 was definitely the first one to present interesting programming techniques and problems in the undergraduate program of computer science at the University of Toronto. After many hours dedicated to labs and assignments, I can say that I have a good understanding of recursive algorithms, a few abstract data structures, and some basic command line user interfaces in Python. Although the course itself could have been much more challenging (and it should be), it was overall a good experience and preparation for my academic future.

Me and my friends have disagreed on the ways some of the course content have been presented to us, and how the course could have been better if it was done in a lower level programming language (such as C++). We have even dreamed about the possibility of the faculty offering an enriched version of CSC148 with more in-depth exploration of data types in lower level languages. However, we were stuck with this course and I believe we made the best out of it. It was fun to solve problems and build solid applications with Python, and it was also good practice for future courses, such as CSC207.

As for a comparison of this course to its precursor, CSC108, this course definitely was more challenging, and requires a much deeper understanding of both Python and programming in general. Assignments, for example, are much more complex in CSC148. Additionally, examinations are slightly more complex in this course as well. In general, it is a considerable, yet not insurmountable, jump from CSC108.

I know I haven’t even mentioned the strike within this posts, and a part of me still feels like I shouldn’t. I mean it is definitely true that it was one of the major events throughout the course, but I feel that it’s not the best idea for me to voice my opinions about it here. Thus, I won’t be doing that. What I can say is that I am glad the strike is over, and that the wonderful TAs are back to look through my slightly interesting posts!

To all my classmates (who probably won’t read this), good luck with your life after CSC148,
Juan Camilo Osorio

Why Geeks Should Write (Revisited)

A couple of weeks ago, at the beginning of one of the courses at the University of Toronto, I wrote about why geeks should write. Today, I am revisiting this topic by going through some of my favorite blogs from my classmates, and how each one helped me either understand course content better, or be inspired to build something amazing. In a sense, I am demonstrating why geeks should write by being a geek who reads, thus giving value to those who took the time to write in the first place. From shared frustrations and code debugging to inspiring and intriguing posts, here are my picks of 2015’s CSC148:

Sooham Rafiz’s post on Assignment 3:

(View this post). My fellow classmate and friend Sooham takes a critical approach to Assignment 3, and, in particular, the suggested algorithm implementation given to us by the faculty. Since I was his teammate for this assignment, we shared the pain and struggle of implementing the (almost) correct pruning optimizations for minimax, and specially had a lot of trouble trying to understand intuitively what the algorithms was doing, since it was so different from common implementations of minimax (including MIT’s one). I strongly support his views on the topic, and encourage the faculty to consider changing the algorithm for future iterations of the course.

Being able to read about other people’s frustrations and struggles is great. No, not in a sadistic way. It’s just good to see that you are not the only one having trouble accomplishing something. Some things aren’t as easy as they seem, and maybe reading about somebody’s approach to a challenge will help you approach a similar one. Knowledge acquired by solving issues is extremely valuable, and being able to grow from obstacles, even if they aren’t your own, is great! I said it once and I’ll say it again: the first reason why geeks should write is that other geeks are very likely to benefit from your geeky endeavours.

Chris Lessard’s post on efficient Binary Search Trees:

(View this post). Chris explores more efficient ways to insert a list of 5000 integers into a binary search tree than were ever mentioned in the course, including red-black search trees that could potentially be used to implement a self-balancing search tree (which is great!). I really enjoyed reading about someone else that thought of extending the basic requirements of the course, trying something different. I posted about my own solution to this problem, which turned out to be quite similar to one of the discussed ones on this post.

It is great to be able to read about different approaches that each person has towards the same problem. If I had not read this post, I wouldn’t have learned about read-black binary trees, and thus my learning would be just that one bit less exciting. I encourage you, the reader, to write about your geeky (or not so geeky) approaches to obstacles that you face in your field, and I’m confident that somebody will benefit from your posts!

Sophie Wong’s take on Why Geeks should Write:

(View this post). I decided to add this article as some sort of meta inception of my own post. I am reviewing a post about why geeks should write on a post that revisits this topic, which is pretty cool. In any case, Sophie mentions a reason why geeks should write that had completely escaped me: we might actually forget how to use normal languages if we don’t practice them. Of course, Sophie (and geeks in general) aren’t worried about not being able to communicate normally. It is more about how a lack of practice in oral and written communication (or anything for that matter) can lead to some deterioration of skills. I found her article both funny and thought provoking. Overall, a great read!

So, if the previous reasons that I have given as to why geeks should write, such as having other geeks benefit from whatever you are writing about, and being able to share your experiences with like-minded people, haven’t convinced you, the last reason why you should begin writing your own blog is that geeks must practice their “soft” skills (such as written communication) as much as they practice their programming/technical skills! The last thing you want is for your rusty writing skills to prevent you from getting that dream job of yours, so get on writing today!

I admit that this post is basically a long persuasive article on why you should start writing your own blog, but I think is worth it, especially since I will potentially end up reading whatever you write when I get stuck on a similar obstacle you have overcome or just reading something inspiring you have done. It is all for my own benefit.

In all seriousness though, I hope that you enjoyed reading about some of the blog posts from students at the University of Toronto, and that you have appreciated (just a little) the wonders of blogging. Whether or not you join this pursue is up to you (or maybe you already did?).

Keep giving me more reasons to think geeks should indeed keep being awesome,
Juan Camilo Osorio

Why Unit Testing Might Seem Like a Waste of Time

Most of us in first year don’t have much (or any) practice in production environments, and we usually work with small code bases and relatively simple software. To me, and many others, unit testing is a complete waste of time, given that we could just check things manually in less time, and our code seldom breaks randomly. If you are one of us who doesn’t see much benefit in testing, don’t worry, I’ve been there.

For example, for the second assignment for CSC148 at the University of Toronto, I spent a good two hours trying to figure out why my doctest wasn’t passing, and the reason turned out to have nothing to do with the actual code I had written. I had basically understood doctests wrong for the longest time. The problem was that I was trying to do two things that don’t go well together: check for newline characters as part of the result of a function call, and split the lines as to meet the “lines must contain 80 characters or less” style guidelines for the course. You can witness part of my struggle, and how I solved it, in this StackOverflow question.

However, consider a project that has thousands of lines of code and a few thousand methods or functions. Now check each one manually. Not fun, right? My point is that, although unit testing might seem pointless in the early days, it will be more than worth it once you join a company or start your own large code base. Writing documentation and making sure your code works with doctests or unit tests is something that we learn out of context. I think that a great idea that beginner courses such as CSC148 could do is have a class-wide project as part of the course, where unit testing and documenting code suddenly becomes crucial to the success of the project.

One of the things I have done to practice writing good test cases and documentation is go through dozens of API documentations, such as Facebook’s, Pinterest’s, or Spotify’s. Although these are mostly web-based, they can help get a general understanding of production-level documentation and testing.

Keep unit testing,
Juan Camilo Osorio

Real Life Abstract Data Types

When you are a computer scientist, you begin to see abstract data types everywhere. From the queues to get food at a cafeteria, to stacks of paper in your desk, and even complex objects such as people, which have different properties and methods. You see, this is the very reason why having abstract data types makes sense in the first place: they can be used to model or describe real-world environments or situations. Strings, numbers, and boolean values are seldom useful if you are trying to describe virtually what a person is, or how a line queue operates. Thus, having a type where you can .enqueue() and .dequeue() people is convenient.

For example, in class, we were asked to implement a minimax algorithm in order to play an arbitrary two-person game (such as tic-tac-toe or subtract square). There, we could have used strings to store each state of the game, and would have had to create a crazy function to parse them and apply the moves, generating new states in order to assess what the best move was. But that’s insane and counter intuitive. That’s why we used recursion and an abstract data type that we created for game states, which allowed us to easily compute new states by applying moves to them, and thus made recursion 100x easier. What would life be without some good ol’ object oriented programming and abstract data types?

Inserting a sorted list into a binary search tree.

In class, we have been looking at binary search trees, and how they can be used to efficiently perform certain actions. We have also learned that some operations, such as finding whether or not a certain binary search tree contains a value or not, have a time complexity proportional to their height, and therefore keeping the height of a tree to a minimum is desirable.

Specifically, in a lab, we were told to find a way to insert a list of values into a BST while keeping the height of such small. Although we were advised to use a random selection in order to achieve this, I was convinced something more efficient was possible. It turns out I was right. By pre-sorting the list of values to be inserted into the binary search tree, it is possible to construct a recursive algorithm that inserts each element in a specific order as to obtain a balanced search tree. Consider the following:

Comparison of Tree Heights.

Since the list is sorted in a non-decreasing order, it definitely doesn’t make sense to insert the elements in the order they are. This is because the naive tree would be produced. Alternatively, we could shuffle randomly the tree (which I believe is what the course instructors wanted us to do in the assignment). However, we can do better. In fact, we can get the smart tree every time.The code I came up with in order to guarantee this is the following:

class BST:
    # ... other methods here
   def insert(self, data):
       """Insert data, if necessary, into this tree.

       Precondition: if data is a list, it is pre-sorted in
       non-decreasing order.

       """
       # Insert list
       if(isinstance(data, list)):
           midata = len(data) // 2
           self.insert(data[midata])
           if(len(data) > 1):
               self.insert(data[:midata])
           if(len(data) > 2):
               self.insert(data[midata + 1:])
       else:
           self._root = _insert(self._root, data)

Here, _insert(...) is a function that adds the value data to self in the correct position. What’s important is that, by choosing the middle element of each array and sub-array it is possible to form a balanced and efficient binary search tree. It is important to note that the list being inserted should be sorted, otherwise the method would not produce a particularly efficient binary tree.

If you are interested in seeing the entire code (including the function _insert, and the other methods of BST), you can download my solution to the entire assignment here!