Christian Hollinger

Software Engineering, GNU/Linux, Data, GIS, and other things I like

25 Feb 2020

How a broken memory module hid in plain sight

1,561 words, ~6 min read

How a broken memory module hid in plain sight – and how I blamed the Linux Kernel and two innocent hard drives

09 Dec 2019

Tensorflow on edge, or – Building a “smart” security camera with a Raspberry Pi

1,886 words, ~7 min read

The number of times my outdoor cameras are set off by light, wind, cars, or anything other than a human is insane. Overly cautious security cameras might be a feature, but an annoying one at that...

07 Aug 2019

How I built a (tiny) real-time Telematics application on AWS

2,938 words, ~11 min read

In 2017, I wrote about how to build a basic, Open Source, Hadoop-driven Telematics application (using Spark, Hive, HDFS, and Zeppelin) that can track your movements while driving, show you how good your driving skills are, or how often you go over the speed limit - all without relying on third-party vendors processing and using that data on your behalf...

01 Jul 2019

A look at Apache Hadoop in 2019

2,380 words, ~9 min read

In this article, we'll take a look at whether Apache Hadoop is still a viable option in 2019, with Cloud-driven data processing and analytics on the rise...

11 Apr 2019

Building a Home Server

2,053 words, ~8 min read

In this article, I’ll document my process of building a home server - or NAS - for local storage, SMB drives, backups, processing, git, CD-rips, and other headless computing...

27 Oct 2018

Analyzing Reddit’s Top Posts & Images With Google Cloud (Part 2 - AutoML)

2,035 words, ~8 min read

In the last iteration of this article, we analyzed the top 100 subreddits and tried to understand what makes a Reddit post successful by using Google’s Cloud ML tool set to analyze popular pictures.

12 Jun 2018

Analyzing Reddit’s Top Posts & Images With Google Cloud (Part 1)

2,096 words, ~8 min read

In this article (and its successors), we will use a fully serverless Cloud solution, based on Google Cloud, to analyze the top Reddit posts of the 100 most popular subreddits. We will be looking at images, text, questions, and metadata...

18 Mar 2018

Analyzing Twitter Location Data with Heron, Machine Learning, Google's NLP, and BigQuery

3,060 words, ~12 min read

In this article, we will use Heron, the distributed stream processing and analytics engine from Twitter, together with Google’s NLP toolkit, Nominatim, and some Machine Learning, as well as Google’s BigTable, BigQuery, and Data Studio, to plot Twitter users' assumed locations across the US.

04 Nov 2017

Data Lakes: Some thoughts on Hadoop, Hive, HBase, and Spark

4,069 words, ~16 min read

This article will talk about how organizations can make use of the wonderful thing that is commonly referred to as a “Data Lake” - what constitutes a Data Lake, how they probably should (and shouldn’t) use it to gather insights, and why evaluating technologies is just as important as understanding your data...

06 Mar 2017

(Tiny) Telematics with Spark and Zeppelin

2,222 words, ~8 min read

How I made an old Crown Victoria "smart" by using Telematics...