R
A language and environment for statistical computing and graphics

What is R?

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.
R is a tool in the Languages category of a tech stack.

180 companies reportedly use R in their tech stacks, including Instacart, Zalando, and Thumbtack.

612 developers on StackShare have stated that they use R.

Eric Colson
Chief Algorithms Officer at Stitch Fix
Amazon EC2 Container Service
Apache Spark
Amazon S3

The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3-based data warehouse. Apache Spark on YARN is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling YARN clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.

Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).

At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan gives our data scientists the ability to quickly productionize the models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn) by automatically packaging them as Docker containers and deploying them to Amazon ECS. This provides our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.

For more info:

#DataScience #DataStack #Data

at Sequoia Consulting Group

We have decided to use R for ML and Shiny for the UI. We are debating between a self-hosted Shiny Server and shinyapps.io. Our decision to go with R came down to the sizes of our data and the availability of tools.

What are my other choices for a vectorized statistics language? My professor was pushing SAS JMP (or was that SPSS?) with a menu-driven, point-and-click approach. (Reproducibility can still be accomplished: you publish the script generated by all your clicks.) But I want to type everything, and there are great online tutorials for R. I think I made the right pick.

Used to connect to databases, run data analytics, and draw diagrams. Also used for machine learning applications, along with SparkR for big data processing.

Tino Gehlert
Data Scientist at Viessmann

Visualisation of air quality in various rooms with R Shiny (hosted free on shinyapps.io).

Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java.
JavaScript is best known as the scripting language for Web pages, but it is also used in many non-browser environments such as Node.js or Apache CouchDB. It is a prototype-based, multi-paradigm scripting language that is dynamic and supports object-oriented, imperative, and functional programming styles.
Fast, flexible and pragmatic, PHP powers everything from your blog to the most popular websites in the world.
Python is a general-purpose programming language created by Guido van Rossum. Python is most praised for its elegant syntax and readable code, which makes it a good fit if you are just beginning your programming career.
HTML5 is a core technology markup language of the Internet used for structuring and presenting content for the World Wide Web. As of October 2014 this is the final and complete fifth revision of the HTML standard of the World Wide Web Consortium (W3C). The previous version, HTML 4, was standardised in 1997.
