C++ vs R: What are the differences?
What is C++? It has imperative, object-oriented, and generic programming features, and also provides facilities for low-level memory manipulation. C++ compiles directly to a machine's native code, which allows well-optimized C++ to be among the fastest languages available.
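As a rough illustration of the three styles named above, here is a minimal sketch. The names `sum`, `Counter`, and `count_zero_bytes` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Generic programming: one template works for any element type
// that supports operator+ and value-initialization.
template <typename T>
T sum(const std::vector<T>& values) {
    return std::accumulate(values.begin(), values.end(), T{});
}

// Object-oriented programming: a small class that hides its state
// behind member functions.
class Counter {
public:
    void increment() { ++count_; }
    int value() const { return count_; }

private:
    int count_ = 0;
};

// Imperative, low-level style: walk a byte buffer through a raw pointer.
std::size_t count_zero_bytes(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);  // precondition: a valid buffer
    std::size_t zeros = 0;
    for (const unsigned char* p = data; p != data + size; ++p) {
        if (*p == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

For instance, `sum(std::vector<int>{1, 2, 3})` adds the elements with no extra abstraction layer at run time, while `count_zero_bytes` shows the kind of direct memory access that higher-level languages usually hide.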
What is R? A language and environment for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.
C++ and R both belong to the "Languages" category of the tech stack.
"Performance" is the top reason why over 146 developers like C++, while over 58 developers mention "Data analysis" as the leading reason for choosing R.
Lyft, OkCupid, and Twitch are some of the popular companies that use C++, whereas R is used by AdRoll, Instacart, and Verba. C++ has broader approval, being mentioned in 199 company stacks and 371 developer stacks, compared to R, which is listed in 128 company stacks and 97 developer stacks.
Ruby NLP C++ Grammar #BNF
At FriendlyData we had a Ruby-based pipeline for natural language processing. Our technology is centered around grammar-based natural language parsing, as well as various product features, and, as the core stack of the company historically is Ruby, the initial version of the pipeline was implemented in Ruby as well.
As we were entering the exponential growth phase, both technology- and product-wise, we looked into how we could improve the performance and flexibility of our [meta-]BNF-based parsing engine. Gradually, we built the pieces of the engine in C++.
Ultimately, the natural language parsing stack spans three universes and three software engineering paradigms: the declarative one, the functional one, and the imperative one. The imperative one was and remains implemented in Ruby, the functional one is implemented in a functional language (this part is under NDA, while everything I am talking about here is part of the public talks we gave throughout 2017 and 2018), and the declarative part, which can loosely be thought of as being BNF-based, is now served by the C++ engine.
The C++ engine for the BNF part removed the immediate blockers, gave us 500x+ performance speedup, and enabled us to launch new product features, most notably query completions, suggestions, and spelling corrections.
How Uber developed the open source, end-to-end distributed tracing system Jaeger, now a CNCF project:
Distributed tracing is quickly becoming a must-have component in the tools that organizations use to monitor their complex, microservice-based architectures. At Uber, our open source distributed tracing system Jaeger saw large-scale internal adoption throughout 2016, integrated into hundreds of microservices and now recording thousands of traces every second.
Here is the story of how we got here, from investigating off-the-shelf solutions like Zipkin, to why we switched from pull to push architecture, and how distributed tracing will continue to evolve:
Maybe not everybody's focus, but I do like programming for @z/OS, @z/Linux and @z/VM with C++, Java and Assembler. Who else loves to dig into control blocks and take a deep dive into system resources to run things in a highly valuable way? And also go all the way up to the application to expose all the infrastructure features to it?
Initially, I wrote my text adventure game in C++, but I later rewrote my project in Rust. Using Rust made it far easier to create a faster, more robust, and bug-free project.
One difficulty with the C++ language is the lack of safety, helpful error messages, and useful abstractions when compared to languages like Rust. Rust would display a helpful error message at compile time, while C++ would often display "Segmentation fault (core dumped)" at run time or a wall of STL errors in the middle of a build. While I would frequently push buggy code to my C++ repository, Rust enabled me to only ever submit fully functional code.
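To make the kind of memory error described above concrete, here is a minimal sketch, not taken from the author's project. Unchecked indexing with `operator[]` past the end of a `std::vector` is undefined behavior and can surface as a segmentation fault, whereas `std::vector::at` performs a bounds check and throws `std::out_of_range`. The helper name `get_or` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns items[index] when the index is valid,
// otherwise returns the caller-supplied fallback value.
int get_or(const std::vector<int>& items, std::size_t index, int fallback) {
    try {
        // at() checks the index and throws std::out_of_range on failure;
        // items[index] would skip the check entirely.
        return items.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

Calling `get_or({10, 20, 30}, 7, -1)` yields the fallback instead of reading out of bounds. The check happens at run time, though, so it does not give the compile-time guarantees the author attributes to Rust.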
Along with the actual language, Rust also included useful tools such as rustup and cargo to aid in building projects, IDE tooling, managing dependencies, and cross-compiling. This was a refreshing alternative to the difficult CMake and tools of the same nature.
The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on YARN is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling YARN clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.
Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).
At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize the models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
For more info:
- Our Algorithms Tour: https://algorithms-tour.stitchfix.com/
- Our blog: https://multithreaded.stitchfix.com/blog/
- Careers: https://multithreaded.stitchfix.com/careers/
#DataScience #DataStack #Data
At FlowStack we write most of our backend in Go. Go is a well thought out language, with all the right compromises for speedy development of speedy and robust software. Its tooling is part of what makes Go such a great language. Testing and benchmarking are built into the language, in a way that makes it easy to ensure correctness and high performance. In most cases you can get more performance out of Rust and C or C++, but getting everything right is more cumbersome.
What are my other choices for a vectorized statistics language? My professor was pushing SAS JMP (or was that SPSS?) with a menu-driven point-and-click approach. (Reproducibility can still be accomplished: you publish the script generated by all your clicks.) But I want to type everything, and there are great online tutorials for R. I think I made the right pick.
C++ is used in Shiro (https://github.com/Marc3842h/shiro).
C++ is a high performance, low level programming language. Game servers need to run with fast performance to be able to reliably serve players across the globe.
The most latency sensitive parts are written in C++. Due to our interconnected services architecture, we use either Python or C++ for each service, with the performance critical parts being C++14.
Used to write PHP extensions - AZTEC Decoder - License Driver scan - Axis2/C to PHP wrapper and Job-scheduler - Barbershop
Performance, zero-overhead abstractions and memory safety of the modern C++ language make this the perfect language for the project.
Used to connect to databases, run data analytics, and draw diagrams. Also used for machine learning applications, along with SparkR for big data processing.
The main programming language of ApertusVR. C++11 & CMake provides multi-platform targeting. The architecture is modular.
Visualisation of air quality in various rooms with R Shiny (hosted free on shinyapps.io).