I have some light reading tonight: Polars User Guide. I basically hate Pandas’s API and hope to use something more nicely designed, and a heck of a lot faster, instead.
📺 I finished watching Slow Horses Season 1 tonight. I loved it and can’t wait for Season 2. I’m wondering if I would like the books that the show is based on. I also wonder if it will get more than two seasons.
I’m uninstalling Visual Studio from my work computer. It feels weird to do it, but I haven’t used it in years. Visual Studio Code killed it for me.
Is there no good and fast way to create an ETL job in Python?
I love using PETL to read, transform, and validate datasets. Its API is simple and straightforward to use, and its documentation, while not perfect, is really good. Unfortunately, PETL’s data processing, especially loading data into SQL Server, is so slow that it is unusable for anything but the smallest datasets.
Odo promises to be much faster at data loads, and seems dead-simple to use, but I cannot get it to work. It turns out that the package hasn’t been updated in five years, which is why it will not work with the modern Python version I am using. I just wasted an hour on it and will probably give it up completely.
I can’t find any other ETL tools that will do the data load for me. I have to rely on the SQL Server Import/Export Wizard, which usually works and is fast but is a one-shot thing because I don’t have SQL Server Integration Services at my disposal.
I suppose I will either have to fork Odo, get it working, and figure out how to get my fork into my project, or I will have to write my own import routine using SQLAlchemy. Both seem difficult to me and not worth the trouble. I’m not the database administrator on my project; I’m supposed to be the data analyst!
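A hand-rolled import routine can be fairly small. The sketch below is only an illustration of the general shape (batched inserts, one transaction per batch), not a tested SQL Server loader: it uses the standard library’s sqlite3 as a stand-in database so that it runs anywhere, and the `people` table, its columns, and the `batched` and `load_rows` helpers are all hypothetical names.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable (hypothetical helper)."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_rows(conn, rows, batch_size=1000):
    """Insert (id, name) rows in batches and return how many were inserted.

    Each batch is its own transaction, so a failure rolls back only
    the batch that failed rather than the whole load.
    """
    total = 0
    for chunk in batched(rows, batch_size):
        with conn:  # commits on success, rolls back on an exception
            conn.executemany(
                "INSERT INTO people (id, name) VALUES (?, ?)", chunk
            )
        total += len(chunk)
    return total


# Stand-in database: an in-memory SQLite table (assumption, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

source = ((i, f"person {i}") for i in range(2500))
inserted = load_rows(conn, source, batch_size=1000)
count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

Against SQL Server the same structure should carry over to a DB-API driver such as pyodbc, whose cursors have a `fast_executemany` option intended for bulk inserts; that part is not exercised here.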
Chess is just poker now
The chess cheating scandal du jour is puzzling and fascinating. Matteo Wong’s article in The Atlantic does not really unpack the issue, but instead provides some depth into how computers have changed the game over the past 25 years:
What once seemed magical became calculable; where one could rely on intuition came to require rigorous memorization and training with a machine. Chess, once poetic and philosophical, was acquiring elements of a spelling bee: a battle of preparation, a measure of hours invested.
It is interesting, though understandable from a technological standpoint, that the concern in the 1990s was that a person might help the computer engine cheat. Today, conversely, the concern is that a computer engine (combined with some spycraft tech, it must be said) might help a person cheat.
Long-running data feeds are a real thing after all
Over the weekend I wrote a SQL Server stored procedure and some user-defined functions it relies on that cleaned up data and inserted it into a new table. It was the sort of thing that took a short time to write, but an awfully long time to execute and to troubleshoot. It gave me an opportunity to learn more about using SQL Server’s “Include Actual Execution Plan” and “Include Live Query Statistics” features, which helped point me in the direction of creating an index that supposedly sped the whole process up by 30%.
The first version of my load script ran for 9.5 hours, and then bombed because it tried to insert a NULL value into the primary key of a table. Zero rows were inserted. I revised it to create a second version, which ran for over ten hours before the server killed it (it timed out or overburdened the database server, I guess). Zero rows were inserted. The final version ran yesterday and overnight last night for over 16 hours, which makes it the longest-running data feed I have ever written by over 10 hours. This run was successful. Only 37,000 rows were inserted, but to get the data I had to comb through over 45 million records, which is why it took so long.
As an auditor, I have not always believed it when IT would tell me a data feed took all weekend to run. This experience helped me understand that it can happen, especially when it is an ad-hoc job.
🎵 Today’s listen: Autofiction by (The London) Suede. It’s really good! They went “punk” after 30 years. 😀
I wrote a SQL script that queried, cleaned, and loaded data into a table. It failed after 9 hours of runtime because of a NULL value in the source data that I had forgotten about. 🤦♂️
I am still learning new T-SQL operators. Today I discovered CROSS APPLY, which joins each row of a table to the output of a table-valued function, with the function’s parameters pulled from that row.
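The semantics of CROSS APPLY can be sketched outside of SQL: for every row on the left, call the function with that row, and emit one combined row per result. The Python below is only a model of that behavior, not T-SQL; the `cross_apply` and `split_items` functions and the sample `orders` data are made up for illustration.

```python
def cross_apply(left_rows, table_fn):
    """Model of CROSS APPLY: pair each left row with every row that
    table_fn returns for it. Left rows that produce no results are
    dropped (OUTER APPLY would keep them, padded with NULLs)."""
    for left in left_rows:
        for right in table_fn(left):
            yield {**left, **right}


# Hypothetical sample data standing in for a table.
orders = [
    {"order_id": 1, "items": "apple,pear"},
    {"order_id": 2, "items": ""},
    {"order_id": 3, "items": "plum"},
]


def split_items(order):
    """Stand-in for a table-valued function: one row per item name."""
    return [{"item": name} for name in order["items"].split(",") if name]


result = list(cross_apply(orders, split_items))
for row in result:
    print(row["order_id"], row["item"])
```

Order 2 has no items, so it does not appear in the output at all, which is the inner-join-like behavior that distinguishes CROSS APPLY from OUTER APPLY.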
🎵 Today’s listen: Asphalt Meadows by Death Cab for Cutie. I’m excited to hear new music from one of my favorite bands from the aughts.
Tableau is a very powerful data visualization tool. The more I use it, though, the weirder it is to me from a UI standpoint. I find it both easier and harder to use than its nearest competitor, Microsoft Power BI.
I coded and tested an update to one of my iOS apps tonight. It has been too long since I touched Xcode. I have been working more evening hours this year than ever before—mainly because I am on dad duty during daytime hours—and don’t get to work on my hobbies as much as I used to.
My son lost his first tooth today! It happened while he was at school, eating an apple. His teacher wrapped it up for us and sent it home with a note.
iOS 16 is very nice! I am enjoying the new Lock Screens and how they tie to the system Focus feature.
My idea to eat healthier on my vacation was poorly conceived. 😅
Whitefield, New Hampshire, band concert. My father-in-law is playing a song he arranged and my kids are dancing and running around the town green.
It is my family’s annual drive-off-to-vacation day. We are all excited. I will be driving pretty much all day.
I am supposed to be packing my tech gear for my vacation, so of course I am updating all my Linux boxes and freeing up space on my file server instead.
Yesterday I wiped my old iPad, which was running iPadOS 16 developer beta 3, and reinstalled iOS 15.6. Of course, an hour after I did so, a new developer beta was released that may have fixed the Music app crashing bugs that were vexing me.
🎵 Today’s listen: Gold by Sister Sparrow. Apple Music labels it an “Alternative” album, but it is blue-eyed soul all the way. My favorite track is “Can’t Get You Out of My Mind”; it’s fierce and fun.
My writing as a child:
Why use a small word when a big word will do?
My writing as an adult:
Why use a big word when a small word will do?
🎵 Today’s listen: firstborn by Nicolle Galyon. It is a veteran country songwriter’s first album. It is full of autobiographical songs which are, interestingly, chronologically ordered from birth to death.
The Music app on the most recent iPadOS 16 developer’s beta crashes a lot. I regret installing the beta for this reason, but it was on a device I could live without, and I did need to test one of my apps.
I don’t need a better notes app. I need to take better notes.
I want to install Linux on an old PC that I have. After looking at all sorts of reviews and videos showing the various desktops and distros I could try, I came to the conclusion that boring old standard Ubuntu would probably suit me best.