Blog posts tagged with 'programming'
- A Distributed System from scratch, with Scala 3 - Part 3: Job submission, worker scaling, and leader election & consensus with Raft
2025-05-18: Upgrades to Bridge Four, the functional, effectful distributed compute system optimized for embarrassingly parallel workloads: Updating Scala, auto-scaling workers, and implementing leader election with Raft (or half of a replicated state machine).
programming, scala, distributed systems, functional programming, cats, cats-effect, tagless final, higher kinded types, zio, raft, bridgefour
- Improving my Distributed System with Scala 3: Consistency Guarantees & Background Tasks (Part 2)
2024-02-19: Improving Bridge Four, a simple, functional, effectful, single-leader, multi worker, distributed compute system optimized for embarrassingly parallel workloads by providing consistency guarantees and improving overall code quality (or something like that).
programming, scala, distributed systems, functional programming, cats, cats-effect, tagless final, higher kinded types, zio, bridgefour
- Building a functional, effectful Distributed System from scratch in Scala 3, just to avoid Leetcode (Part 1)
2023-06-19: Building something that already exists (but worse) so I don't have to think about Leetcode: Bridge Four, a simple, functional, effectful, single-leader, multi worker, distributed compute system optimized for embarrassingly parallel workloads.
programming, scala, distributed systems, functional programming, cats, cats-effect, tagless final, higher kinded types, zio, bridgefour
- Why I use Linux
2020-12-21: One question I do get in earnest quite frequently is why I put up with running GNU/Linux distributions for development work. An attempt at a simple response.
linux, gnu, programming, bsd, mac, windows
- RE: Throw Away Code? Use go, not Python or Rust!
2020-09-26: Responding to an article on using Rust for throw away code and prototyping: Making a case for go over Rust, Python, and perl.
go, rust, python, perl, golang, programming, benchmarking, performance, development, linux
- A Data Engineering Perspective on Go vs. Python (Part 2 - Dataflow)
2020-07-06: In Part 2 of our comparison of Python and go from a Data Engineering perspective, we'll finally take a look at Apache Beam and Google Dataflow: how the go SDK and the Python SDK differ, what drawbacks we're dealing with, how fast it is based on extensive benchmarks, and how feasible it is to make the switch.
go, golang, python, dataflow, beam, google cloud, gcp, spark, big data, programming, benchmarking, performance
- A Data Engineering Perspective on Go vs. Python (Part 1)
2020-06-11: Exploring golang - can we ditch Python for go? And have we finally found a use case for go? Part 1 explores high-level differences between Python and go and gives specific examples on the two languages, aiming to answer the question based on Apache Beam and Google Dataflow as a real-world example.
go, golang, python, dataflow, beam, spark, big data, programming
- Tensorflow on edge, or – Building a “smart” security camera with a Raspberry Pi
2019-12-09: The amount of time my outdoor cameras are being set off by light, wind, cars, or anything other than a human is insane. Overly cautious security cameras might be a feature, but an annoying one at that...
big data, linux, machine learning, maker, programming, python, raspberry, tensorflow, vision
- How I built a (tiny) real-time Telematics application on AWS
2019-08-07: In 2017, I wrote about how to build a basic, Open Source, Hadoop-driven Telematics application (using Spark, Hive, HDFS, and Zeppelin) that can track your movements while driving, show you how your driving skills are, or how often you go over the speed limit - all without relying on 3rd party vendors processing and using that data on your behalf...
aws, bash, cloud, iot, kinesis, lambda, linux, programming, python
- A look at Apache Hadoop in 2019
2019-07-01: In this article, we'll take a look at whether Apache Hadoop is still a viable option in 2019, with Cloud driven data processing and analytics on the rise...
aws, azure, big data, cloud, google cloud, hive, linux, programming, spark
- Analyzing Reddit’s Top Posts & Images With Google Cloud (Part 2 - AutoML)
2018-10-27: In the last iteration of this article we analyzed the top 100 subreddits and tried to understand what makes a reddit post successful by using Google’s Cloud ML tool set to analyze popular pictures.
analytics, automl, big data, cloud, google cloud, gcp, machine learning, programming, python, tensorflow, vision
- Analyzing Reddit’s Top Posts & Images With Google Cloud (Part 1)
2018-06-12: In this article (and its successors), we will use a fully serverless Cloud solution, based on Google Cloud, to analyze the top Reddit posts of the 100 most popular subreddits. We will be looking at images, text, questions, and metadata...
analytics, automl, big data, cloud, google cloud, gcp, machine learning, programming, python, tensorflow, vision
- Analyzing Twitter Location Data with Heron, Machine Learning, Google's NLP, and BigQuery
2018-03-18: In this article, we will use Heron, the distributed stream processing and analytics engine from Twitter, together with Google’s NLP toolkit, Nominatim, and some Machine Learning, as well as Google’s BigTable, BigQuery, and Data Studio, to plot Twitter users' assumed locations across the US.
analytics, big data, cloud, google cloud, gcp, machine learning, programming, hbase, nlp, heron, storm, java