Python


Decision at Stream about Go, Stream, Python, Yarn, Babel, Node.js, ES6, JavaScript, Languages, FrameworksFullStack

Avatar of nparsons08
Developer Evangelist at Stream ·

Winds 2.0 is an open source Podcast/RSS reader developed by Stream, with a core goal of enabling a wide range of developers to contribute.

We chose JavaScript because nearly every developer knows or can, at the very least, read JavaScript. With ES6 and Node.js v10.x.x, it’s become a very capable language. Async/Await is powerful and easy to use (Async/Await vs Promises). Babel allows us to experiment with next-generation JavaScript (features that are not in the official JavaScript spec yet). Yarn allows us to consistently install packages quickly (and is filled with tons of new tricks).

We’re using JavaScript for everything – both front and backend. Most of our team is experienced with Go and Python, so Node was not an obvious choice for this app.

Sure... there will be haters who refuse to acknowledge that there is anything remotely positive about JavaScript (there are even rants on Hacker News about Node.js); however, had we not written everything in JavaScript, we would not have seen the results we did.

#FrameworksFullStack #Languages

29 upvotes·84.5K views

Decision at FundsCorner about Zappa, AWS Lambda, SQLAlchemy, Python, Amazon SQS, Node.js, MongoDB Stitch, PostgreSQL, MongoDB

Avatar of jeyabalajis

Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.

We set ourselves the following criteria for the optimal tool that would do this job:

- The data replication must be near real-time, yet it should NOT impact the production database
- The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient

Based on the above criteria, we selected the following tools to perform the end to end data replication:

We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using Stitch Triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.

We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB Stitch offers integration with AWS services.

In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.

Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pick up the data payload, and mirror the DB changes onto the target data warehouse. We implemented source-to-target data translation by modelling target table structures through SQLAlchemy. We deployed this micro-service as an AWS Lambda with Zappa. With Zappa, deploying your services as event-driven, horizontally scalable Lambda services is dead easy.
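To make that step concrete, here is a minimal sketch of what such a Lambda handler might look like. The queue wiring, table schema, and event payload shape are assumptions for illustration, not our production code:

```python
# Hypothetical sketch: consume MongoDB change events from SQS and mirror
# them into the warehouse via SQLAlchemy. Schema and payload are assumed.
import json

import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import insert as pg_insert

engine = sa.create_engine("postgresql://user:pass@warehouse-host/dw")
metadata = sa.MetaData()

# Target table structure modelled explicitly through SQLAlchemy Core.
orders = sa.Table(
    "orders", metadata,
    sa.Column("id", sa.String, primary_key=True),
    sa.Column("amount", sa.Numeric),
    sa.Column("status", sa.String),
)

def handler(event, context):
    """Entry point invoked by AWS Lambda for each batch of SQS messages."""
    with engine.begin() as conn:
        for record in event["Records"]:
            change = json.loads(record["body"])
            op = change["operationType"]
            if op in ("insert", "update", "replace"):
                doc = change["fullDocument"]
                stmt = pg_insert(orders).values(
                    id=str(doc["_id"]), amount=doc["amount"], status=doc["status"],
                )
                # Upsert so replays of the same message stay idempotent.
                conn.execute(stmt.on_conflict_do_update(
                    index_elements=["id"],
                    set_={"amount": doc["amount"], "status": doc["status"]},
                ))
            elif op == "delete":
                conn.execute(orders.delete().where(
                    orders.c.id == str(change["documentKey"]["_id"])
                ))
```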

In the end, we implemented a highly scalable, near real-time Change Data Replication service that "works", and deployed it to production in a matter of days!

23 upvotes·112.4K views

Decision at Uploadcare about PostCSS, Preact, Ember.js, React, Python, Django

Avatar of dmitry-mukhin

Simple controls over complex technologies, as we put it, wouldn't be possible without neat UIs for our user areas including start page, dashboard, settings, and docs.

Initially, there was Django. Back in 2011, considering our Python-centric approach, that was the best choice. Later, we realized we needed to iterate on our website more quickly. And this led us to detaching Django from our front end. That was when we decided to build an SPA.

For building user interfaces, we're currently using React as it provided the fastest rendering back when we were building our toolkit. It’s worth mentioning Uploadcare is not a front-end-focused SPA: we aren’t running at high levels of complexity. If we were, we’d go with Ember.js.

However, there's a chance we will shift to the faster Preact, with its motto of using as little code as possible, and because it makes more use of browser APIs. One of our future tasks for our front end is to configure our Webpack bundler to split up the code for different site sections. For styles, we use PostCSS along with its plugins such as cssnano which minifies all the code.

All that allows us to provide a great user experience and quickly implement changes where they are needed with as little code as possible.

22 upvotes·88.4K views

Decision at Uploadcare about AWS Elastic Load Balancing (ELB), Amazon EC2, Python, Tornado

Avatar of dmitry-mukhin

The 350M API requests we handle daily include many processing tasks such as image enhancements, resizing, filtering, face recognition, and GIF to video conversions.

Tornado is the one we currently use and aiohttp is the one we intend to implement in production in the near future. Both tools support handling huge amounts of requests, but aiohttp is preferable as it uses asyncio, which is Python-native. Since Python is at the heart of our service, we initially used PIL, followed by Pillow. We kind of still do. When we figured resizing was the most taxing processing operation, Alex, our engineer, created the fork named Pillow-SIMD and implemented a good number of optimizations into it to make it 15 times faster than ImageMagick.
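Because Pillow-SIMD is a drop-in replacement for Pillow, the application code doesn't change at all; a minimal resize looks the same either way (file names here are placeholders):

```python
# Pillow-SIMD is a drop-in replacement for Pillow: install it instead
# (pip install pillow-simd) and this exact code runs unchanged, just faster.
from PIL import Image

with Image.open("input.jpg") as img:
    # LANCZOS is a high-quality, relatively expensive resampling filter,
    # which is exactly where the SIMD optimizations pay off.
    thumbnail = img.resize((800, 600), Image.LANCZOS)
    thumbnail.save("output.jpg", quality=85)
```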

Thanks to the optimizations, Uploadcare now needs six times fewer servers to process images. Here, by servers I also mean separate Amazon EC2 instances handling processing and the first layer of caching. The processing instances are also paired with AWS Elastic Load Balancing (ELB) which helps ingest files to the CDN.

20 upvotes·29.8K views

Decision at Stitch Fix about Amazon EC2 Container Service, Docker, PyTorch, R, Python, Presto, Apache Spark, Amazon S3, PostgreSQL, Kafka, Data, DataStack, DataScience, ML, Etl, AWS

Avatar of ecolson
Chief Algorithms Officer at Stitch Fix ·

The algorithms and data infrastructure at Stitch Fix are housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on Yarn is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling Yarn clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.
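As a rough illustration of that pattern (the bucket paths and schema below are made up, not our actual jobs), a PySpark job in this setup reads from S3, transforms, and writes back to S3:

```python
# Hypothetical sketch of an ETL job in the pattern described above:
# Spark (on Yarn) reads raw data from S3, transforms it, and writes
# back to the S3-based warehouse that Presto also queries.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-etl").getOrCreate()

events = spark.read.parquet("s3a://raw-bucket/events/")  # placeholder path

daily_counts = (
    events
    .withColumn("day", F.to_date("event_ts"))
    .groupBy("day", "event_type")
    .count()
)

# Write back to the warehouse; downstream dashboards query the same S3 data.
daily_counts.write.mode("overwrite").parquet("s3a://warehouse-bucket/daily_counts/")
```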

Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).

At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize those models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying them to Amazon ECS. This gives our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
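Khan itself isn't open source, so here is only a hedged sketch of the kind of artifact such a framework packages: a model trained with an open source framework plus a minimal serving entry point. The function name and file layout are assumptions, not Khan's API:

```python
# Hedged sketch: the kind of artifact a packaging framework might
# containerize. We train a model with an open source framework and
# persist it; the serving wrapper below is an assumption.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100).fit(X, y)

# The serialized model is what gets baked into the Docker image.
joblib.dump(model, "model.joblib")

def predict(features):
    """Minimal serving entry point a deployment wrapper could expose."""
    loaded = joblib.load("model.joblib")
    return loaded.predict([features]).tolist()
```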


#DataScience #DataStack #Data

19 upvotes·120.8K views

Decision at Sentry about Rust, Python

Avatar of jtcunning
Operations Engineer at Sentry ·

Sentry's event processing pipeline, which is responsible for handling all of the ingested event data that makes it through to our offline task processing, is written primarily in Python.

For particularly intense code paths, like our source map processing pipeline, we have begun rewriting those bits in Rust. Rust’s lack of garbage collection makes it a particularly convenient language for embedding in Python. It allows us to easily build a Python extension where all memory is managed from the Python side (if the Python wrapper gets collected by the Python GC, we clean up the Rust object as well).
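Our actual bindings aren't shown here, but the ownership pattern can be sketched from the Python side with ctypes; the library and function names below are hypothetical:

```python
# Hypothetical ctypes sketch of the pattern described: Python owns the
# lifetime of a Rust object, and when the Python wrapper is collected,
# the Rust memory is freed too. Library and function names are made up.
import ctypes

lib = ctypes.CDLL("libsourcemap_processor.so")  # a compiled Rust cdylib
lib.processor_new.restype = ctypes.c_void_p
lib.processor_free.argtypes = [ctypes.c_void_p]

class SourceMapProcessor:
    def __init__(self):
        # Opaque pointer to the Rust-side object.
        self._ptr = lib.processor_new()

    def __del__(self):
        # Python GC collects the wrapper -> we free the Rust object.
        if self._ptr:
            lib.processor_free(self._ptr)
            self._ptr = None
```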

18 upvotes·1 comment·32.8K views

Decision at Thumbtack about C, Go, Rust, Python

Avatar of marcoalmeida

One important decision for delivering a platform-independent solution with a low memory footprint and minimal dependencies was the choice of programming language. We considered a few, from Python (there was already a reasonably large Python code base at Thumbtack) to Go (we were taking our first steps with it), and even Rust (too immature at the time).

We ended up writing it in C. It was easy to meet all the requirements with only one external dependency (for implementing the web server), there were clearly no challenges running it on any of the Linux distributions we were maintaining, and it was arguably the implementation with the smallest memory footprint given the choices above.

15 upvotes·43.1K views

Decision at Uploadcare about PostgreSQL, Amazon DynamoDB, Amazon S3, Redis, Python, Google App Engine

Avatar of dmitry-mukhin

Uploadcare has built an infinitely scalable infrastructure by leveraging AWS. Building on top of AWS allows us to process 350M daily requests for file uploads, manipulations, and deliveries. When we started in 2011, the only cloud alternative to AWS was Google App Engine, which was a no-go for the rather complex solution we wanted to build. We also didn’t want to buy any hardware or use co-locations.

Our stack handles receiving files, communicating with external file sources, managing file storage, managing user and file data, processing files, file caching and delivery, and managing user interface dashboards.

At its core, Uploadcare runs on Python. The EuroPython 2011 conference in Florence really inspired us, and that inspiration, coupled with the fact that Python was general enough to solve all of our challenges, informed this decision. Additionally, we had prior experience working in Python.

We chose to build the main application with Django because of its feature completeness and large footprint within the Python ecosystem.

All the communications within our ecosystem occur via several HTTP APIs, Redis, Amazon S3, and Amazon DynamoDB. We decided on this architecture so that our system could be scalable in terms of storage and database throughput. This way we only need Django running on top of our database cluster. We use PostgreSQL as our database because it is considered an industry standard when it comes to clustering and scaling.
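A minimal sketch of how that wiring might look in Django settings (hosts, names, and the Redis cache backend are assumptions for illustration, not our production config):

```python
# Hypothetical Django settings sketch for the architecture described:
# Django on top of a PostgreSQL cluster, with Redis for caching.
# Hosts, names, and credentials are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "uploadcare",
        "HOST": "pg-cluster.internal",  # placeholder cluster endpoint
        "PORT": 5432,
    }
}

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://redis.internal:6379/0",  # placeholder
    }
}
```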

15 upvotes·31K views

Decision at Codecov about Jest, vuex, Python, Vue.js

Avatar of hootener
CTO at Codecov ·

We chose Vue.js at Codecov to replace a front end that was based mostly on server-side rendered Python templates and was getting fairly long in the tooth. The move to Vue.js allowed us to take a more component-driven approach to our front end, providing greater flexibility and reuse when creating new pages and refactoring old ones. Another bonus was how easily we could integrate Axios with Vue.js for making AJAX calls within Vue.js components and their associated vuex stores. We were also able to easily integrate Vue.js with the Jest testing framework, which allowed us to provide test coverage for a front end where none previously existed.

The move to Vue.js has allowed us to be more agile in our front-end development by further decoupling our front end from our back end. Additionally, by fully embracing a component-driven approach, we're able to more easily isolate and test functionality, leading to a more readable, maintainable, and extensible front-end codebase.

15 upvotes·23.9K views

Decision at Dubsmash about Kubernetes, Amazon EC2, Heroku, Python, ContainerTools, PlatformAsAService

Avatar of tspecht
Co-Founder and CTO at Dubsmash ·

Since we deployed our very first lines of Python code more than two years ago, we have been happy users of Heroku. It lets us focus on building features rather than maintaining infrastructure, has super-easy scaling capabilities, and the support team is always happy to help (in the rare case you need them).

We've toyed with the idea of moving our computational needs over to plain Amazon EC2 instances or a container-management solution like Kubernetes a couple of times, but the added cost of maintaining this architecture and the ease of use of Heroku have kept us from moving forward so far.

Running independent services for different needs of our features gives us the flexibility to choose whatever data storage is best for the given task.

#PlatformAsAService #ContainerTools

14 upvotes·28.9K views