The HyperDev Tech Stack: Powering Over 1M Containers


Editor's note: By Gareth Wilson, Member of Technical Staff at HyperDev and FogBugz


About Fog Creek and HyperDev

Fog Creek Software, founded in 2000 by Joel Spolsky and Michael Pryor, is the maker of FogBugz and Trello, and co-creator of Stack Overflow. We recently launched the open beta of HyperDev, a developer playground for quickly building full-stack web apps. It removes all setup, so you only have to worry about writing the code for your web app. Or, as one user quipped, ‘let's crowdsource a million containers’, something we’ve done within just a few weeks of launch.



In HyperDev, the apps you create are instantly live and hosted by us. They’re always up to date with your latest changes because changes are deployed as you type them. You can invite teammates to your project so that you can collaborate on code together and see changes as they’re made. To get started quickly you’re able to remix existing community projects, and every project gets a URL for editing and viewing, so you can share your code or your creations.

But perhaps the most interesting thing about HyperDev is what you don’t have to do to get a real, fully-functional web app running:

  • You don’t need to make an account.
  • You don’t need to configure and set up a web server.
  • You don’t need to sign up with a host and wait for name servers to update.
  • You don’t need to install an operating system, LAMP, Node or anything else.
  • You don’t need to commit, build, or deploy your code.

It takes care of all that for you, so you can just focus on writing code. This makes it a great option for those just learning to code, who can avoid some of the complexity whilst getting up to speed. But it’s useful for more experienced developers too, who want to bang out some code and create a product quickly to get feedback.

Engineering at Fog Creek

At Fog Creek, Engineering is split into two inter-disciplinary product teams: FogBugz and HyperDev. Members work across the full stack, QA and testing. The majority of staff work on FogBugz, but more are being added to HyperDev as it develops. Currently, a team of 8 contributes to HyperDev, of which 5 are full-time (3 working on the back-end and 2 on the front-end), with additional support for project management, marketing and system administration.

The HyperDev Tech Stack

About HyperDev

There are 3 main parts to HyperDev: the collaborative text editor, the hosted environment your app runs in, and the quick deploy of code from the editor to that environment. We've found that code changes need to show up in the running app in under 1.5 seconds. Otherwise, the experience doesn't feel fluid, and you end up longing for your local dev setup. What's more, we need to do this at scale. We think millions of developers, and would-be developers, can benefit from HyperDev, and we've already seen hundreds of thousands of developers try things out, so it needs to be able to do all of this for thousands of projects at once.

The Frontend Client

To deliver this, the frontend client is a serverless, single-page app composed of compiled JavaScript, HTML, and CSS. It's served from an AWS S3 bucket fronted by the CloudFront CDN, and we use Route 53 for DNS. This setup allows us to serve the files quickly, in a scalable way, with low latency from most locations. Once the app is loaded, it talks to our Orchestration API (OAPI).

The frontend editor client app is built with Node.js and written in CoffeeScript. We use Stylus for our stylesheets and Hamlet.coffee for reactive templating. Browserify then compiles the code and assembles the dependencies into a single JavaScript file, which we minify using Uglify and then gzip.

The choice of CoffeeScript was mostly due to familiarity and a preference within the team for its minimal code aesthetic. You might not be familiar with Hamlet.coffee: it's the creation of one of our team members, Daniel X Moore. It nicely solves the problem of using CoffeeScript with a Jade-like syntax for reactive templating, without having to resort to hacks like those we needed when using Knockout and Backbone.

Within the app we also have npm embedded, so you can search for and select packages to include in your package.json file directly. The search and metadata for that are provided by an API service called Libraries.io.

Client Architecture

From the outset, we decided not to write our own editor. In general, online editors are difficult to write and although there are complexities in depending on someone else’s design for an editor, we knew that the editor was not our core value. So we chose to use Ace and hook in our own minor modifications to interface into our editor model.

For collaborative editing we use Operational Transformation (OT), which allows edits made in 2 or more instances of the same document to be applied across all the clients and the back-end, irrespective of the order in which they are generated and processed. To help jumpstart our implementation of OT on the frontend we used a fork of Firepad under the Ace editor, which uses the OT.js lib internally. This was then interfaced into our own model of the documents and the app's WebSocket implementation.
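To make the idea concrete, here is a minimal sketch in Go of how two concurrent inserts can be transformed so that both sites end up with the same document. It is not the OT.js or Firepad implementation: the Insert type, the Apply and Transform functions and the site-ID tie-break are assumptions made for illustration, and a real OT library also handles deletes and retained spans.

```go
package main

import "fmt"

// Insert is a hypothetical operation: insert Text at rune offset Pos.
// Site is an illustrative tie-breaker for inserts at the same position.
type Insert struct {
	Pos  int
	Text string
	Site int
}

// Apply returns doc with op applied.
func Apply(doc string, op Insert) string {
	r := []rune(doc)
	out := make([]rune, 0, len(r)+len(op.Text))
	out = append(out, r[:op.Pos]...)
	out = append(out, []rune(op.Text)...)
	out = append(out, r[op.Pos:]...)
	return string(out)
}

// Transform rewrites a so that it can be applied after the concurrent op b.
// If b inserted earlier in the document (or at the same offset with a lower
// site ID), a's position shifts right by the length of b's text.
func Transform(a, b Insert) Insert {
	if b.Pos < a.Pos || (b.Pos == a.Pos && b.Site < a.Site) {
		a.Pos += len([]rune(b.Text))
	}
	return a
}

func main() {
	doc := "helo"
	a := Insert{Pos: 3, Text: "l", Site: 1} // site 1 fixes the typo
	b := Insert{Pos: 4, Text: "!", Site: 2} // site 2 appends punctuation

	// Each site applies its own edit first, then the transformed remote edit.
	site1 := Apply(Apply(doc, a), Transform(b, a))
	site2 := Apply(Apply(doc, b), Transform(a, b))
	fmt.Println(site1, site2, site1 == site2)
}
```

Both sites apply the edits in a different order, yet the transform step leaves them with identical text, which is the property that lets clients and the back-end process operations as they arrive.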

Using Firepad and Ace was a real boon whilst pushing towards our MVP, as it meant we could direct our dev resources elsewhere and we could leverage the established themes and plugins built upon Ace.



The Backend Orchestration API

We use AWS for our backend, which meant we didn’t have to commit too early to any given part of the stack, from hardware through to OS and infrastructure services.

We knew we wanted to provide multi-language support in HyperDev, even though at beta release we'd just be offering Node.js. So when it comes to handling users' code, the proxies that accept requests from the frontend client, process them and orchestrate the client’s running code are all written in Go.

We chose Go because it is strong in concurrent architectures, has powerful primitives and robust HTTP handling. In addition, several of our stack components were written natively in Go, which gave us confidence in the client APIs we would need. Go also has the benefit of producing standalone binaries, so our dependencies would be minimal once we had a binary compiled for the appropriate architecture.

Backend Architecture

The Proxies

At the front of our backend, the proxies are distributed across our AWS availability zones. Each has a health endpoint, and a proxy that fails its health check is pulled out of Route 53 DNS. It's the responsibility of the proxies, written in Go, to route traffic either to an existing available instance of the user’s project or to place the project on a backend node and route to that.
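As a rough illustration of the health endpoint, the sketch below shows a Go handler that answers 200 while a proxy is able to serve traffic and 503 otherwise. Route 53 HTTP health checks treat a 2xx or 3xx response as healthy, so a 503 would lead to the proxy being taken out of rotation. The Proxy type, the /health path and the SetHealthy method are hypothetical names, not HyperDev's actual code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Proxy is a hypothetical stand-in for one frontend-facing proxy.
type Proxy struct {
	healthy atomic.Bool
}

// SetHealthy records whether the proxy can currently serve traffic.
func (p *Proxy) SetHealthy(ok bool) { p.healthy.Store(ok) }

// HealthHandler answers 200 while the proxy is healthy and 503 otherwise.
func (p *Proxy) HealthHandler(w http.ResponseWriter, r *http.Request) {
	if !p.healthy.Load() {
		http.Error(w, "unhealthy", http.StatusServiceUnavailable)
		return
	}
	fmt.Fprintln(w, "ok")
}

// Probe performs one in-process health check and returns the status code.
func Probe(p *Proxy) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	p.HealthHandler(rec, req)
	return rec.Code
}

func main() {
	p := &Proxy{}
	p.SetHealthy(true)
	fmt.Println(Probe(p)) // 200
	p.SetHealthy(false)
	fmt.Println(Probe(p)) // 503
}
```

In a real deployment the handler would be registered on an HTTP server with http.HandleFunc and the health flag would be driven by checks on the proxy's own dependencies.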

Since all the frontend proxies needed to know the state of project placement, which was fluid over time, we decided to experiment with etcd. Each of the proxies was a node in the etcd cluster, so that it had a local copy of the state. We were then able to use atomic compare-and-swap changes to route consistently to the right backend instance. However, as we ramped up in the early beta we noticed periodic hangs in servicing requests. It turned out that because etcd uses an append-only log, after a few thousand changes it needs to “flatten” its view of the data through snapshots. Our increasingly busy set of user projects would trigger this flattening regularly, which led to the hangs. So for now, we’ve moved over to PostgreSQL for state handling.
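The compare-and-swap idea can be sketched independently of any particular store. In the Go example below, an in-memory map guarded by a mutex stands in for the shared state; it does not use the etcd or PostgreSQL client APIs, and the PlacementStore type and its methods are assumptions made purely for illustration. Two proxies race to place the same project, and the swap guarantees that only one placement wins and both route to it.

```go
package main

import (
	"fmt"
	"sync"
)

// PlacementStore is a hypothetical in-memory stand-in for the shared
// placement state (project ID -> backend node).
type PlacementStore struct {
	mu    sync.Mutex
	nodes map[string]string
}

func NewPlacementStore() *PlacementStore {
	return &PlacementStore{nodes: make(map[string]string)}
}

// CompareAndSwap sets the project's node to next only if the current value
// equals prev. An empty prev means "not placed yet".
func (s *PlacementStore) CompareAndSwap(project, prev, next string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.nodes[project] != prev {
		return false
	}
	s.nodes[project] = next
	return true
}

// Place tries to claim candidate for the project. If another proxy already
// placed it, Place returns the node that won instead.
func (s *PlacementStore) Place(project, candidate string) string {
	if s.CompareAndSwap(project, "", candidate) {
		return candidate
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.nodes[project]
}

func main() {
	store := NewPlacementStore()
	results := make([]string, 2)
	var wg sync.WaitGroup
	for i, node := range []string{"node-a", "node-b"} {
		wg.Add(1)
		go func(i int, node string) {
			defer wg.Done()
			results[i] = store.Place("my-project", node)
		}(i, node)
	}
	wg.Wait()
	fmt.Println(results[0] == results[1]) // both proxies agree on one node
}
```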

The Container Servers

A user’s application is sandboxed in a Docker container running on AWS EC2 instances. We chose Docker due to its strong API and documentation. An orchestration service then needs to coordinate the content on the disk, content changes with the editor, the Docker containers used for installation and running the user’s code, and the returning of all the necessary logs back to the user’s editor.

The challenge here is that some parts of the architecture needed to be fast, with low-latency exchange of messages between the components, while others needed to handle long-running, blocking events such as starting a user’s application. To get around this, we used a hub-and-spoke messaging model. The hubs were non-blocking event loops that would listen and send on Go channels. Each spoke would represent a single instance of either a project’s content, with OT support, or its container environment, via the Docker APIs. This architecture has worked well and enabled us in the early days to split the proxies off from the container servers without too much effort, and a messaging approach lends itself to decoupling components as needs arise.
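A hub-and-spoke arrangement along these lines might look like the following Go sketch. It is an illustration under assumptions rather than HyperDev's code: the Message and Hub types, the per-project spokes and the buffer sizes are all hypothetical, and a short sleep stands in for slow work such as starting a container. The hub only forwards messages onto buffered channels, so it stays responsive as long as the buffers have room, while each spoke is free to block.

```go
package main

import (
	"fmt"
	"time"
)

// Message is a hypothetical envelope routed by the hub.
type Message struct {
	Project string
	Body    string
}

// Hub forwards messages to one spoke per project.
type Hub struct {
	Inbox   chan Message
	Results chan string
	spokes  map[string]chan Message
}

func NewHub() *Hub {
	return &Hub{
		Inbox:   make(chan Message),
		Results: make(chan string, 16),
		spokes:  make(map[string]chan Message),
	}
}

// Run is the hub's event loop. It does no slow work itself: it creates a
// spoke on first use and hands each message over on a buffered channel.
func (h *Hub) Run() {
	for msg := range h.Inbox {
		spoke, ok := h.spokes[msg.Project]
		if !ok {
			spoke = make(chan Message, 16)
			h.spokes[msg.Project] = spoke
			go h.runSpoke(spoke)
		}
		spoke <- msg
	}
	for _, spoke := range h.spokes {
		close(spoke) // let the spokes drain and exit
	}
}

// runSpoke owns a single project and may block on long-running work.
func (h *Hub) runSpoke(in <-chan Message) {
	for msg := range in {
		time.Sleep(10 * time.Millisecond) // stands in for slow container work
		h.Results <- msg.Project + ": " + msg.Body
	}
}

func main() {
	h := NewHub()
	go h.Run()
	h.Inbox <- Message{Project: "app-1", Body: "start"}
	h.Inbox <- Message{Project: "app-2", Body: "start"}
	h.Inbox <- Message{Project: "app-1", Body: "edit"}
	close(h.Inbox)
	for i := 0; i < 3; i++ {
		fmt.Println(<-h.Results)
	}
}
```

Messages for the same project are handled in order by that project's spoke, while different projects proceed independently, so the output order across projects can vary between runs.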

Post-launch, as we scaled up, we ran into a number of kernel bugs. So we tried out several OS and Docker version combinations, and in the end settled on Ubuntu Xenial with Docker. This works well for stability under load.

Overall, this part of the system has proven quite difficult to maintain. There’s opportunity to simplify things and leverage Docker Swarm, so we’re in the process of moving over to that. We’ll also be re-evaluating whether Amazon’s ECS can help too, though it may well prove to be an unnecessary layer of complexity over Docker Swarm.

Using HyperDev to Power HyperDev

We also use a number of HyperDev hosted elements as part of our backend services. That way we're always dogfooding our own product. This is important for us: if we want people to trust and rely on HyperDev for their projects, then we should be happy to do the same for our own. This includes our authentication and authorization services, which are the first services used once the frontend is running. So any problems on the backend are customer-facing and immediately impact a user’s experience of the product. This has caused some growing pains compared with using more mature, battle-hardened options. But it has meant we’ve been focused on reliability from the outset.

Visibility and Reporting

For tracking event flows in the front-end we use Google Analytics, which gives us both reporting on specific events users take within the app and a 10,000ft view of the overall activity trends in the app and on the website. We also use New Relic to get an overview of the performance of our systems and application.

For the backend, as with any system that crosses multiple system boundaries in a single transaction, it is important to keep visibility into what is going on. In the early days, system logs worked OK. But as the number of systems grew beyond a few, combined with the random placement of projects, it became important to stream the logs off the servers. We chose Loggly for this. The wins were that we weren’t filling up disks with debug logs, we could filter logs that crossed multiple systems, and, with well-formatted logs, we could generate charts and reports.

Project Management

To keep us organized, and to plan and prioritize upcoming work, we use FogBugz. We previously used Trello for this, but as the number of items grew it became easier to manage this with FogBugz. We also use Google Docs for getting feedback on new feature ideas and marketing plans etc.

Continuous Integration and Deployment

One of the core development principles of the team is to deploy often and expose failures fast. We could not achieve that without a continuous integration and deployment pipeline. Once the code is checked into the GitHub repository, Travis CI kicks off an integration flow for all branches and Pull Requests. Any problems are rapidly identified and posted to our #Dev channel in Slack.

Travis CI enables us to run unit tests with Mocha, enforce coverage using Istanbul, run any compile and packaging steps and, if everything passes in our deployment channel, push to staging or production as appropriate.

Travis also allows us to run long-running tests while we continue to work. These tests include coverage tests and race condition checking. The latter exposed some issues in our code structure that we were grateful to learn about up front, because there is nothing worse than trying to debug an unexpectedly failing process across an unknown number of servers when it hits a race condition.
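We don't go into which tool does the race checking here, but for Go code one widely used option is the race detector built into the Go toolchain, enabled with the -race flag (for example, go test -race). The sketch below shows the kind of shared state it is designed to watch; the Counter type and CountConcurrently function are hypothetical. With the mutex in place the code is race-free, whereas removing the lock around the increment would be a data race that the detector reports when the code runs under -race.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical piece of state shared by many goroutines.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments under the mutex. Without the lock, concurrent calls would
// be a data race.
func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Value returns the current count.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// CountConcurrently starts the given number of goroutines, each of which
// increments the counter once, and returns the final count.
func CountConcurrently(workers int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(CountConcurrently(100)) // prints 100
}
```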

And lastly, Travis allows us to compile the binaries and upload them to a repository, tagged with the Git commit string. This means that downstream in staging and production we are using the same binary, pulled from the repository, that passed the tests in Travis.

This approach has been valuable for us as it allows us to focus on developing code instead of deploying. It has forced us to maintain test coverage levels from day 1, and we know that if something does go wrong in production then we have a nimble and predictable deployment pipeline.

With the OAPI needing to interact with so many components, and our need to create repeatable and reliable stacks in development, staging, and production, we’ve spent more time on repeatability than on speed of deployment. So from day 1 we started codifying the stack in Ansible. This was great because our only dependency was SSH, and we could run it just as easily against our development environment in Vagrant as against staging and production in AWS. The downside is that our deploys feel behind the curve on speed: we have something that works, albeit slower than we might want.

The Future

We have big plans for HyperDev, with a number of major features in the pipeline. First up will be support for multiple languages. At the moment it's just Node.js with JavaScript, so we're keen to open that up. The first ones will likely be Ruby, Python, Perl, and other dynamic web languages. Due to the language-agnostic way we approached building out the backend from the outset, this doesn't require significant backend work. It's mostly work on the front-end, optimizing the experience for specific languages, such as integrating different package managers.

Overall, we’re happy with our stack. We’ve had to learn a number of lessons quickly, as our launch brought more than 3 times the number of users we had anticipated (but that’s a nice problem to have!). However, no early-stage stack is perfect, and we’re continuing to refine and try different options as we scale up, improve the speed and performance of the service, and deliver the rock-solid reliability our users deserve.

So next time you want to write a quick script or prototype a new product, remember to give HyperDev a try.

Check out the HyperDev Stack.
