What is XMPP and what are its top alternatives?
XMPP alternatives & related posts
related Firebase posts
This is my stack in Application & Data
My Utilities Tools
Google Analytics Postman Elasticsearch
My Devops Tools
Git GitHub GitLab npm Visual Studio Code Kibana Sentry BrowserStack
My Business Tools
Fontumi focuses on the development of telecommunications solutions. We have opted for technologies that allow agile development and great scalability.
Firebase and Node.js + FeathersJS are technologies that we have used on the server side. Vue.js is our main framework for clients.
Our latest products launched have been focused on the integration of AI systems for enriched conversations. Google Compute Engine , along with Dialogflow and Cloud Firestore have been important tools for this work.
Git + GitHub + Visual Studio Code is a killer stack.
related Socket.IO posts
I use Socket.IO because the application has 2 frontend clients, which need to communicate in real-time. The backend-server handles the communication between these two clients via websockets. Socket.io is very easy to set up in Node.js and ExpressJS.
In the research project, the 1st client shows panoramic videos in a so called cave system (it is the VR setup of our research lab, which consists of three big screens, which are specially arranged, so the user experience the videos more immersive), the 2nd client controls the videos/locations of the 1st client.
related Kubernetes posts
Visual Studio Code worked really well for us as well, it worked well with all our polyglot services and the .Net core integration had great cross-platform developer experience (to be fair, F# was a bit trickier) - actually, each of our team members used a different OS (Ubuntu, macos, windows). Our production deployment ran for a time on Docker Swarm until we've decided to adopt Kubernetes with almost seamless migration process.
After our positive experience of running .Net core workloads in containers and developing Tweek's .Net services on non-windows machines, C# had gained back some of its popularity (originally lost to Node.js), and other teams have been using it for developing microservices, k8s sidecars (like https://github.com/Soluto/airbag), cli tools, serverless functions and other projects...
How Uber developed the open source, end-to-end distributed tracing Jaeger , now a CNCF project:
Distributed tracing is quickly becoming a must-have component in the tools that organizations use to monitor their complex, microservice-based architectures. At Uber, our open source distributed tracing system Jaeger saw large-scale internal adoption throughout 2016, integrated into hundreds of microservices and now recording thousands of traces every second.
Here is the story of how we got here, from investigating off-the-shelf solutions like Zipkin, to why we switched from pull to push architecture, and how distributed tracing will continue to evolve:
related Docker Compose posts
Recently I have been working on an open source stack to help people consolidate their personal health data in a single database so that AI and analytics apps can be run against it to find personalized treatments. We chose to go with a #containerized approach leveraging Docker #containers with a local development environment setup with Docker Compose and nginx for container routing. For the production environment we chose to pull code from GitHub and build/push images using Jenkins and using Kubernetes to deploy to Amazon EC2.
We also implemented a dashboard app to handle user authentication/authorization, as well as a custom SSO server that runs on Heroku which allows experts to easily visit more than one instance without having to login repeatedly. The #Backend was implemented using my favorite #Stack which consists of FeathersJS on top of Node.js and ExpressJS with PostgreSQL as the main database. The #Frontend was implemented using React, Redux.js, Semantic UI React and the FeathersJS client. Though testing was light on this project, we chose to use AVA as well as ESLint to keep the codebase clean and consistent.
I've been recently getting really into home automation- you know, making my house Smart™, which basically means half the time my lights don't turn on and the other half of the time apparently my kitchen faucet needs a static IP address.
But it's been a blast! It's a fun way to write code for yourself, outside of work, to have an impact in the real world. It's a nice way of falling in love with a different side of programming again.
I've used Apple's HomeKit for awhile, since we're pretty all-in in Apple devices at home, but the rough edges have been grating at me more and more. HomeKit is so opaque- you can't see what's wrong, why a device is unresponsive, and most importantly: the compatibility isn't there. HomeKit has a limited selection of — more expensive — accessories, and as you go beyond just simple LED lights, you want a bit more power. Also, we're programmers, dammit, gimme all the things.
Anyway, I've switched to Home Assistant the last few months, and I'm kicking myself I didn't make the switch earlier. As a programmer, it's great: you get the most capability than pretty much any other smart home platform (integrations have been written for most devices and technologies out there today), it's easier to debug, and when you want to go bigger than just simple lights on/off, HA has some really powerful stuff behind it.
I use Home Assistant in conjunction with Docker and Docker Compose; since the config is extracted out, upgrades are usually as easy as a pull of the latest version. I've just started digging into writing integrations for a lesser-used device that I have at home, and HA makes it pretty straightforward to just magically add it to the home network.
It plays well with others, too- we require a VPN connection in to the home network to access our Home Assistant install, and HA has a few tricks to help with that (ignoring the VPN route if you're on a local network, etc). Nice client support for iOS and Android, too.
Anyway, big fan of Home Assistant if you want to go beyond simple home automations and setup. Wish I would have done it a lot earlier. Also, big fan of jumping into all this if you have the time and interest to do so- it's been tickling a different part of my code brain than I've had access to in awhile, and that's been fun in and of itself.
related RabbitMQ posts
As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.
Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.
We've been using RabbitMQ as Zulip's queuing system since we needed a queuing system. What I like about it is that it scales really well and has good libraries for a wide range of platforms, including our own Python. So aside from getting it running, we've had to put basically 0 effort into making it scale for our needs.
However, there's several things that could be better about it:
* It's error messages are absolutely terrible; if ever one of our users ends up getting an error with RabbitMQ (even for simple things like a misconfigured hostname), they always end up needing to get help from the Zulip team, because the errors logs are just inscrutable. As an open source project, we've handled this issue by really carefully scripting the installation to be a failure-proof configuration (in this case, setting the RabbitMQ hostname to
127.0.0.1, so that no user-controlled configuration can break it). But it was a real pain to get there and the process of determining we needed to do that caused a significant amount of pain to folks installing Zulip.
pika library for Python takes a lot of time to startup a RabbitMQ connection; this means that Zulip server restarts are more disruptive than would be ideal.
* It's annoying that you need to run the
rabbitmqctl management commands as root.
But overall, I like that it has clean, clear semanstics and high scalability, and haven't been tempted to do the work to migrate to something like Redis (which has its own downsides).
related Kafka posts
The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka, and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on Yarn is our tool of choice for data movement and #ETL. Because our storage layer (s3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling Yarn clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for adhoc queries and dashboards.
Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).
At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated systems. That requires serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize those models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientist a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
For more info:
- Our Algorithms Tour: https://algorithms-tour.stitchfix.com/
- Our blog: https://multithreaded.stitchfix.com/blog/
- Careers: https://multithreaded.stitchfix.com/careers/
#DataScience #DataStack #Data
As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.
We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, as well as shifting to Amazon Kinesis instead of Kafka.