~/

Getting into Competitive Programming

fenwick tree, segment tree, max-flow, sqrt decomposition, union-find, convex hull, ??!

“What exactly are those?!”
That was me back in my freshman year of school when friends talked about them.

My curiosity was sparked, and I started looking into those terms. It turned out that those were common data structures, algorithms, and techniques in competitive programming. Soon, I learned that it was a common thing to take part in programming contests during high school in Canada, and this could lead to a place in the coveted International Olympiad in Informatics (IOI) competition.

I started getting more familiar with jargons like those, but I never knew how they work and when to apply them. I tried looking for resources from time to time to see how I could get into the competitive programming field, but I felt that there was a huge learning curve as my knowledge in data structures and algorithms was quite shallow during that time. Most of the articles I read on websites like Reddit and Quora eventually led me to Topcoder and Codeforces, but I felt that those platforms were too daunting to me.

I was fortunate enough to stumble across a couple of university classes in late 2018, and some of them were pretty promising:

One that stood out was CS 3233 from the National University of Singapore (NUS). It is taught by Steven Halim, the author of the popular Competitive Programming textbook and Singapore’s IOI team leader. During that time, I was considering to do an exchange term at NUS (which I eventually did) just to enroll in the class.

There were a couple of questions on the website that I had to consider:

  1. “Do you have national (but preferably international) programming competition background before? Examples: NOI (or IDN OSN, VNM VNOI, CHN NOIP, MYS MCO, PHL NOI, IND ICO, etc), IOI, ICPC (especially ICPC Asia Singapore Regional Contest 2015/2018), Google Code Jam, Facebook Hacker Cup, Topcoder Open, etc?”

  2. “Are you OK to have your coding style ‘somewhat damaged’ because of this course?”

I was shocked when I saw those two questions on the course’s website. Initially, I was afraid that the class would be too challenging for me because I do not meet the prerequisites (which usually take years to prepare, I think). Also, I would have never imagined myself writing code like the ones described in the screenshot above. After careful consideration, I thought it would be a good idea to challenge myself and learn something completely new to me from scratch.

Preparation (November - December 2018)

To enroll into the course without meeting the prerequisites, one needs to show that he or she has achieved a score of at least 500 on Kattis, an online judge with a ton of programming problems, before January 1st. The score is the sum of the difficulties of the problems that one has solved, and the difficulty of each problem ranges between 1.1 (super easy) to 9.6 (extremely difficult - i.e. might involve multiple data structures at the same time).

I picked up a copy of the Competitive Programming 3 book and started working through it. Most Kattis problems that I worked on were between 1 to 3 pointers that involve basic data structures and algorithms (graph traversal, sorting, backtracking, etc.). Unlike interview-related problems which are straightforward, contest problems are fairly convoluted, and they come with a storyline. The goal is to identify the problem in the text and solve it using a solution that is fast enough to pass all the test cases in terms of both correctness and time constraints. For actual contests, the speed in which a participant solves a problem matters.

As I work on problems, I started building confidence in solving them. Most of them were actually really interesting! 20 problems a day… 30 problems a day… 50 problems a day… Annnd, I finally made it to 500 points, so yay! :tada:

Kattis does not provide a graph that shows progress over time, so I wrote a tool that scrapes data from my profile periodically since I wanted to track my progress.

Here are the results!

The Flipped-Classroom Experience (January - April 2019)

“We only study well-known/solved problems, not research problems.”

This class is about problem solving and algorithms, and it aims to prepare students who want to compete in the ACM ICPC or IOI competition. There were about 20 Singaporean high school students who attended the class as part of their IOI training, and about half of the class (excluding high school students) has IOI/IMO/ICPC World Finals background (and most of them have medals :scream:).

The entire class environment was really competitive, but I liked it as it helped me to learn better and faster. While I was struggling to understand how those data structures and algorithms work, some students were already discussing the performance and limitations of those algorithms. :persevere:

Here’s a typical schedule of the class:

  • 5.30pm - 6.45pm: Individual Mini Contest (typically 3 problems)
  • 6.45pm - 7.15pm: Contest Debrief by our tutor Robin (also 3-time IOI medallist)
  • 7.15pm - 7.30pm: Break/Free Food
  • 7.30pm - 9.00pm: Lecture by Steven
  • 9.00pm - ??? : Homework + more Kattis problems at home for practice!

Unlike the actual ACM ICPC competition which only allows up to 25 pages of printed notes for each team, the mini contest allows unlimited digital notes, but not the Internet.

Fun fact: some of the mini contests were sponsored (i.e. with cash prizes :dollar: ).

The entire class was basically 12 days of algorithmic training, except that it was spread out over the course of 4 months. We covered topics related to Graph Modeling, Dynamic Programming, Graph Theory/Network Flow (e.g. push-relabel), NP-hard (e.g. Travelling Salesman Problem, Minimum Vertex Cover problem), Computational Geometry (e.g. convex hull, rotating calipers), etc. Those topics were interesting, and I felt that the class not only gave me an exposure to a whole new world of data structures and algorithms but also improved my problem-solving skills.

Besides the regular class schedule, there were two team contests: midterm and final. The midterm team contest was very difficult. Most teams only AC-ed 1 to 2 problems out of 10 problems within 4 hours. The final team contest was reasonable, though we ended up at the bottom 15%! :satisfied:

Common acronyms:
Accepted (AC), Presentation Error (PE), Wrong Answer (WA), Time Limit Exceeded (TLE), Memory Limit Exceeded (MLE)

Reflections

Before this class, I knew very little about applying data structures and algorithms that I have learned back in Waterloo. Now all those proofs in my Number Theory, Combinatorics, and Graph Theory classes made sense, to some extent. Throughout the whole term, there were a lot of aha-moments as I could easily relate back to the theoretical stuff back in Waterloo, and this is something that I really value a lot since I am more of a practical person.

However, the class was not easy at all, especially for those who do not have any prior experience in programming contests. Ultimately, competitive programming goes back to coding under time pressure. One requires both speed and strong algorithmic thinking skills to excel in the field, and it is rather challenging to work on both in just four months.

I spent most of my time in NUS focusing on the latter by working on a variety of problems without time pressure. Though I feel that I am still quite slow in competitive coding, I am very satisfied with my progress since the goal was to explore competitive programming. The graph below illustrates where I started and where I ended up.

I also learned to write code like this! :joy:

int m, x, y, res, n, OK;
int HOLES[12]; // cols

inline void count_solutions(int row, int diag1, int diag2, int col) {
  if (row == OK) {
    ++res;
    return;
  }
  int pos = OK & (~(row | diag1 | diag2 | HOLES[col]));
  while (pos) { // pos gives you all bits for rows available for next col
    int p = pos & -pos;
    pos -= p;
    count_solutions(row | p, (diag1 | p) << 1, (diag2 | p) >> 1, col + 1);
  }
}

For the data enthusiasts, here are some statistics on the problems I have solved on Kattis throughout the past 6 months (2 for preparation + 4 for class):

507 problems in total with an average difficulty of 3.16. Over the course of 4 months, I have gained over 1,000 points on Kattis, which I think is amazing, but I have also seen a classmate who progressed from 1,000 to 4,500 points :boom:. At the end of the day, it depends on how much time you put in and how long it takes to solve a problem. Practice makes perfect. :slightly_smiling_face:

Note that the difficulty of problems increases as one solves more problems since fewer questions will be left in the pool.


Peter Buhr (a Waterloo concurrency prof) once said that concurrency completes the software development part of a Computer Science (CS) degree after all the object-oriented and design patterns stuff. Now that I think about it, perhaps Competitive Programming completes the algorithmic part of a CS degree after all the mathematical proofs.

Back at Shopify, I remember asking Simon Eskildsen, my team lead who was once an IOI participant, whether taking time off school and practicing on contest problems for a few months would bring someone without any background in IOI/IMO to ICPC, and his answer was something along the lines of “extremely challenging”. Now I know why! :slightly_smiling_face:

Want to get into Competitive Programming?

Well, it’s not too late! However, be prepared to be mentally challenged. I would first recommend getting familiar with at least one of C++, Java, or Python, and have some understanding of basic data structures and algorithms (e.g. arrays, stacks, queues, maps, BFS, DFS). Most competitive programmers tend to use one of those programming languages since they are pretty robust and come with many built-in libraries, which will make coding easier. Finally, everything else is just practice.

Practice, practice, practice, and read editorial solutions at the same time to gain more knowledge, but do not memorize them. Lots of practice will help train mental sharpness in knowing which data structure or algorithm is relevant when a problem is presented.

If you are new to data structures and algorithms, I would recommend starting out with interview-like problems on HackerRank and LeetCode to be familiar with them. Once you feel more comfortable, move on to easy Kattis problems (1 - 2 points) and slowly work on harder ones. If you need hints, you can check out Methods to Solve or use the Kattis Hint Chrome extension (if you are on Google Chrome). If you need some guided content, the Competitive Programming textbook might help.

A good thing to keep in mind is that a lot of problems on Kattis involve applying specific algorithms directly without modifying them. When it comes to real competitive programming, it is no longer that, but more on a combination of problem-solving paradigms (e.g. binary search + dynamic programming + graph traversal), which usually requires creativity and deep understanding of those data structures and algorithms.

I wasn’t that into CodeForces, but I heard that starting off with Div. 3 or Educational contests will help a lot in speed. Problems in those categories usually focus on fundamentals and do not require any special data structures or algorithms. In the long run, the target would be Div. 2 and Div. 1. There is a decent CodeForces discussion on becoming better in competitive programming (i.e. leveling up to a higher rank).

As you work on more problems, you will feel that writing the basic code structure (e.g. int main() {...} and typedefs) gets tedious, so building your own template (like mine in C++) will be really helpful. At the same time, I would recommend keeping a repository of code of data structures and algorithms for future reference. Here are some examples:


Here is a compilation of stuff (non-exhaustive) that I have learned throughout the six months. Some of them can be seen as bad software engineering practices, so use them at your discretion:

SUMMARY (expand)
  • Be creative in coming out with solutions. For example, wrapping a 2D cell with -1s can avoid code that handles bounds.
  • Some online judges (e.g. Kattis) will trim whitespaces for you, but some wouldn't. So be careful in your output format to prevent getting a presentation error.
  • `std::set` in C++ can act as a min/max priority queue at the same time.
  • Not all problems can be solved easily using C++. Look at bounds. Java BigInteger or Python might help.
  • Always observe the bounds. For example, if the range of input is between 9 and 10 billion, we could map them to 0 to 1 billion so that it will fit in an integer. Technically, we can use `long long` for this, but that's just an example. This can avoid Java BigInteger sometimes. Or maybe, prevent yourself from initializing a Fenwick Tree of size 1 billion, when all you need is just 250,000.
  • Know your numbers and be good in estimating the number of operations. You don't need a linear or logarithmic solution for everything. If the input size is 10, even `O(N^4)` would work.
  • Avoid storing values in structs. Struct creation is slow. Use multiple arrays instead: `int w[MAXN], h[MAXN]`.
  • Do not print stuff in Kattis without initializing it. It will give WA even if you modified it through referencing.
  • Always remember to reset states for variables declared globally! (I made this mistake a lot.)
  • Try to deal with integers rather than floating points if possible. Sometimes, we could multiply floating points numbers with LCM of all numbers to get integers.
  • Be careful of using `long double` with `printf()`! This feature is broken in MinGW (GCC port for Windows). This means that if the judge is using MinGW, we're in trouble. Solution: use `cout` instead, or convert `long double` to `double`.
  • Use `getchar_unlocked()` instead of `scanf()` if you want to squeeze out that extra running time.
  • Backtracking with bitmask could speed runtime up by close to twice. (Might be irrelevant to the actual contest.)
  • `ulimit -s [SIZE]` to change stack size locally. Helpful when testing large input cases. See `ulimit -H -a` to see hard limit.
  • Learn to look at problems backward. Solving it from the front could be difficult! See Kattis: artwork (hint: union-find from the back).
  • Pattern finding: generate output for small instances and use OEIS to search for sequence (useful for online contests).
  • Offline algorithms: only works if the range of output is small. Run your naive algorithm in the background to print out all possible values. `O(1)` to get your answer.
  • Watch webcasts and see how other competitive programmers code.
  • Most importantly, you don't always need fancy data structures. Focus on basic problem-solving. Learn to modify algorithms instead of applying them directly.
That's all I have!

Also, check out this ACM ICPC World Finals 2019 live stream by a couple of amazing competitive programmers (See tourist, Petr, and Endagorion on CodeForces). I started to appreciate screencasts like this after actually being in a programming contest itself.

What’s next?

I recall that I mentioned last year that I’d like to look into competitive programming for the rest of my 2018. Well, that worked out! :raised_hands: I don’t plan to become a competitive programmer in the future, but I guess I have a new hobby now. Perhaps focusing on improving my speed through CodeForces would be helpful…

It’s time to go back into software development. I just joined Cockroach Labs last month as an intern on the SRE team, and it’s pretty interesting to see stuff related to competitive programming in CockroachDB’s source code! Here are some examples:

It seems like the SQL optimizer team at Cockroach Labs does stuff that has a lot of overlap with competitive programming, so if you’re interested… :wink:

TIL: I have always thought we don’t need fancy data structures and algorithms in software development, but I guess that is not the case!

To Steven and Robin, thank you for an amazing class experience!

6 months of competitive programming. Was it worth it? Absolutely.

But seriously, who would have thought of using extensive bitwise techniques in recursive backtracking or doing a breadth-first-search from source and target at the same time (meet-in-the-middle technique) in production?!

It’s fascinating, isn’t it?


P.S. I temporarily suspended the cronjob on Kattis Tracker since I was hitting close to 10k rows on Heroku Postgres, and I didn’t want to pay for it. :pensive:

(Thanks to Anzo (the future Putnam Fellow :wink:) for looking through this and giving some of his opinions. Check out his blog here!)