Friday, July 12, 2013

Trusting the API is not a good idea

You are programming right along and need to accomplish what appears to be a simple task. You look at the API used by the company and find a one line call to do what you need. Make the call, get the expected results and move on to the next task. All rainbows and unicorns right?

Not so fast there bucko! The key word here is fast. The method I called did do what it needed to do and returned nearly instantly because I hardly had any data in my test database. Once you had a large amount of data it was painfully slow as we found out when attempting to run it at a client site. I had no logging output around the call as it was never an issue for me. It just looked like the program froze when running at the client location.

When something breaks I blame myself. I figure it must be in my code. Took another developer looking over my shoulder to point out "That call might be slow". I had looked past that line over and over again. Of course when I ran the program it sailed right past that line so I never thought about it. I was blinded to it because I was sure I was the one to blame.

We were able to have the client run another command against the database to clear out some old data to get them past that point. We then found the next step, again relying on the API, was also very slow but I had logging in place to watch that progress.

Time to rethink how all of this needed to work. We fixed the initial call to work in a reasonable amount of time and are in the process of gutting the code to write records to a file which we will then import via "copy {tablename} from {outputfile}" x 4 instead of doing inserts in the code. We were able to import 1.2 million records in 1 minute and 46 seconds using the 3 copy calls where it was going to take days to do it in code one insert at a time while parsing JSON streams.

This particular client has a much larger dataset than any other client. We are talking to a 3rd party vendor to get the 8.2 million records that need to be inserted via JSON requests. After these code changes things will run faster and scale better for every client. My part of the code is the JSON to our database format conversion. So far that appears to be working properly at least.

I made the mistake of trusting the existing API calls. I have only been at the company for 4 months meaning I had no background in company lore. I had no ideas what areas had bitten folks in the behind in the past and I figured the every call was optimized. Trust no code especially "this one line can't do much or take long". I should have put logging messages above and below the API call so I could have spotted this in the client run quicker. We knew up front they had a ton of data. We knew we did not have a large sample for testing. This is where extra logging is really helpful.

Lesson learned. I hope I don't make it again.