We use RabbitMQ because we need messaging in several places in our infrastructure, from real-time message ingestion to reacting asynchronously to user actions. It speaks AMQP, it's easy to set up and manage, and running it on our own instances avoids vendor lock-in.
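The core pattern here is decoupling: one part of the system publishes a message and moves on, while another part consumes and reacts to it on its own schedule. A minimal in-process sketch of that producer/consumer shape in Python (using the standard library's `queue` as a toy stand-in for the broker; the action names are illustrative, not from our actual system):

```python
import queue
import threading

# Toy stand-in for a message broker: the producer publishes user actions,
# the consumer reacts to them asynchronously on a separate thread.
q = queue.Queue()
handled = []

def consumer():
    while True:
        msg = q.get()
        if msg is None:  # sentinel: shut down the consumer
            break
        handled.append(f"processed:{msg}")

t = threading.Thread(target=consumer)
t.start()

for action in ["signup", "upload", "like"]:
    q.put(action)  # "publish" a message and move on immediately

q.put(None)
t.join()

print(handled)  # → ['processed:signup', 'processed:upload', 'processed:like']
```

With RabbitMQ the queue lives in the broker rather than in-process, so the producer and consumer can be separate services on separate machines, but the contract is the same.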
We use Amazon Redshift because it's based on PostgreSQL and allows us to quickly query vast amounts of data for reporting and analytics.
We use Akka because it enables us to implement complex reactive applications. It has great documentation and is easy to work with, especially if you use the Scala programming language.
We use Play because it makes it very easy to write complex RESTful APIs. It is built on Akka, so it also integrates very well with Akka, which makes some otherwise complicated work easy to achieve.
We use Hadoop because it allows us to process and store vast amounts of data that arrive at very high rates. The whole ecosystem is well maintained and documented, and it suits our needs very well.
We use Python because it makes prototyping very easy, and its syntax encourages clean code from the very start. It is also the top language of choice in data science, with some great libraries that we take advantage of.
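As a taste of why prototyping is so quick, a few readable lines of standard-library Python are enough to filter and summarise a dataset (the latency numbers here are made up for illustration):

```python
from statistics import mean, median

# Toy event latencies in milliseconds -- illustrative data only.
latencies = [12, 7, 22, 5, 31, 9, 14]

# Quick prototype: drop outliers above 30 ms, then summarise.
trimmed = [x for x in latencies if x < 30]
summary = {
    "count": len(trimmed),
    "mean": round(mean(trimmed), 2),
    "median": median(trimmed),
}
print(summary)  # → {'count': 6, 'mean': 11.5, 'median': 10.5}
```

The same exploration in a lower-level language would take noticeably more ceremony, which is exactly why Python wins for prototyping and data-science work.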
We use Java because it allows us to write very high-performance web-based applications. It is strongly typed, it has a huge community, and there are lots of very good engineers using the language. This makes finding good talent easier.
We use Scala because it's the main language behind Apache Spark. We now use it outside the Spark ecosystem too, because it runs on the JVM and lets us share libraries between Scala and Java. It allows us to write complex code very concisely using functional programming principles.
We use Docker because it's great for prototyping and setting up development environments. It keeps things nicely isolated and makes some complex deployments simple.
We use Apache Spark because of its scalability and the fact that it allows us to build complex ETL and aggregation processes on top of vast amounts of data.
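The shape of these jobs is usually "filter, transform, aggregate by key". A toy sketch of that shape in plain Python (the rows are made-up data; Spark's value is running the same logic partitioned across a cluster rather than in one process):

```python
from collections import defaultdict

# Made-up input rows: (user_id, event, bytes) tuples standing in for a
# dataset far too large for one machine.
rows = [
    ("u1", "play", 120),
    ("u2", "play", 300),
    ("u1", "pause", 0),
    ("u1", "play", 180),
    ("u2", "stop", 0),
]

# Extract/transform: keep only "play" events.
plays = [(user, b) for user, event, b in rows if event == "play"]

# Aggregate: total bytes per user -- the same reduce-by-key shape a Spark
# job would express with groupBy/agg or reduceByKey.
totals = defaultdict(int)
for user, b in plays:
    totals[user] += b

print(dict(totals))  # → {'u1': 300, 'u2': 300}
```

Spark lets us keep the logic this declarative while it handles partitioning, shuffling, and fault tolerance underneath.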