How Codacy Analyzes 30 Billion Lines of Code Per Day

Codacy
Codacy automates code reviews and monitors code quality on every commit and pull request. It reports the impact of every commit or pull request in terms of new issues concerning code style, best practices, security, and more. It monitors changes in code coverage, code duplication, and code complexity. It lets developers save time in code reviews and tackle technical debt efficiently.

Editor's Note: Jaime Jorge is co-founder and CEO at Codacy.

Codacy helps dev teams of all sizes automate their code quality by identifying issues through static code analysis, both in the cloud and on-premises. The product notifies users about security issues, code coverage, code duplication, and code complexity in every commit and pull request, directly from their current workflow. We sat down with Jaime to learn more about the technology behind Codacy's automated code review platform.

StackShare: Why did you and your other co-founder create Codacy?

Jaime Jorge: As developers ourselves, we started the company because we wanted to help developers focus on software development instead of just fixing code. I was researching this topic for my master's thesis (working with Telcos in Europe) to understand technical debt (in terms of code duplication), and Joao (my co-founder) was leading tech teams in the financial industry in the UK. What brought us together was the mission of helping as many developers and companies as we could to ship better code and increase their productivity.

Founded in 2012, Codacy now employs 40 people (more than half of whom are technical) between our offices in Lisbon and NYC.

StackShare: Out of the 28 supported languages, which one do you see used the most on your platform?

JJ: The usage distribution of our supported programming languages follows what you’d expect from rankings such as the TIOBE index. The most used language in Codacy is JavaScript, a result of a strong clustering of web development use cases. Java, Python, Ruby, and a few others follow close behind.

StackShare: It’s amazing how small your team is yet you support so many different languages.

JJ: When we started Codacy, we only supported Scala (on which our product is built). Following requests from new users over time, we started adding support for more languages. We understood that modern development does not rely on one programming language alone, and that modern tech stacks most often combine many different languages. This forced us to create a platform that would make it easy for us to add new programming languages and to keep their support up to date. We also let our users bring their own language support by exposing our integration mechanism.

StackShare: How do you use Codacy to build Codacy?

JJ: Our team uses Codacy every day, primarily to maintain the same criteria of development (formatting, coverage, best practices) across the different dev squads. There are features we use more often than others, which mirrors what we see from our customers.

StackShare: Which features does your team use most often?

JJ: Some team members like to use the dashboards to keep track of the main quality metrics. Others like the build status we provide to make sure we’re within the defined criteria. The whole team uses the auto-comment feature, which helps our teams stay in touch.

StackShare: What platforms do you integrate with?

JJ: Our most popular integrations are with GitHub, GitLab, Bitbucket, CircleCI, Jenkins, and Slack, although we support many others.

StackShare: How does Codacy provide notifications for security issues?

JJ: As part of our code analysis, we provide security notifications via the tools we integrate with.

StackShare: Tell us about your secure development practices.

JJ: We develop following security best practices and frameworks (OWASP Top 10, SANS Top 25). Our developers participate in regular security training to learn about common vulnerabilities and threats, and we review our code for security vulnerabilities. We also regularly update our dependencies and make sure none of them has known vulnerabilities.

Our teams use Static Application Security Testing (SAST) to detect basic security vulnerabilities in our codebase, and Dynamic Application Security Testing (DAST) to scan our applications.

StackShare: What’s the biggest mistake new developers make when setting up an automated code review system?

JJ: Incorrect or incomplete configuration.

StackShare: How many automated code reviews do you process daily?

JJ: We pull about 8 TB per day. Assuming 1 byte per character and 256 characters per line, that comes to ~3*10^10 lines (about 30 billion). Interestingly, this is about 40% of the text content in the Library of Congress (according to Wolfram Alpha).
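
The estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the two assumptions stated in the answer (1 byte per character, 256 characters per line). The interview does not say whether "TB" means 10^12 or 2^40 bytes, so both readings are shown; the function and constant names are illustrative.

```python
# Back-of-envelope check of the "30 billion lines per day" figure.
BYTES_PER_CHAR = 1     # assumption stated in the interview
CHARS_PER_LINE = 256   # assumption stated in the interview

TB_DECIMAL = 10**12    # 1 TB read as 10^12 bytes
TIB_BINARY = 2**40     # 1 TB read as 2^40 bytes (1 TiB)


def lines_per_day(bytes_per_day: int) -> int:
    """Convert a daily byte volume into an estimated line count."""
    return bytes_per_day // (BYTES_PER_CHAR * CHARS_PER_LINE)


decimal_estimate = lines_per_day(8 * TB_DECIMAL)
binary_estimate = lines_per_day(8 * TIB_BINARY)

print(f"{decimal_estimate:,}")  # 31,250,000,000
print(f"{binary_estimate:,}")   # 34,359,738,368
```

Either reading lands at roughly 3*10^10 lines, which matches the rounded figure of about 30 billion.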

StackShare: How do you store all of that data?

JJ: All of our services run in the cloud on AWS. We don’t host or run our own routers, load balancers, DNS servers, or physical servers.

StackShare: What AWS services do you use specifically for getting that data processed, indexed and stored?

JJ: Data is processed using EC2 instances. We currently run our applications using Docker on Elastic Beanstalk, but we are transitioning to EKS. The data is stored on RDS, where we use both Aurora and Postgres. Although the volume of data we pull to analyze is 8TB, the analysis results (that we actually store) are significantly smaller. You don’t need the code verbatim for every source file - you just store the issues and where in the file you found them. We then leverage AWS to scale elastically (e.g. the number of active analysis servers) with the current load.
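
To illustrate why the stored results are so much smaller than the code that was analyzed, here is a minimal sketch of an issue record that keeps only the location and the finding, not the source text. The `Issue` class, its field names, and the sample values are hypothetical and are not Codacy's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class Issue:
    """One finding: where it is and what it is, without the source text."""
    file_path: str   # hypothetical field names, for illustration only
    line: int
    pattern_id: str
    message: str


def stored_size(issues: List[Issue]) -> int:
    """Size in bytes of the issues when serialized as JSON."""
    return len(json.dumps([asdict(i) for i in issues]).encode("utf-8"))


# A stand-in for an analyzed source file (60,000 bytes).
source = "x = 1\n" * 10_000
issues = [Issue("src/app.py", 42, "unused-variable", "Variable 'x' is never used")]

print(len(source.encode("utf-8")), "bytes analyzed")
print(stored_size(issues), "bytes stored")
```

The stored record grows with the number of issues found rather than with the size of the file, which is the property the answer describes.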

StackShare: Does this process still involve Scala or another language?

JJ: Our applications are all implemented in Scala. They do all the heavy lifting regarding data processing/indexing.

StackShare: How long do you retain that data?

JJ: The repositories are cloned, analyzed and then deleted.
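
The clone, analyze, delete cycle can be sketched with a temporary directory that is removed as soon as the analysis returns. This is an illustration of the pattern only, not Codacy's implementation (which the interview says is written in Scala); `analyze_ephemerally`, `fetch`, and `analyze` are hypothetical names, and the fetch step is passed in as a callable so the sketch runs without a network or a Git installation.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def analyze_ephemerally(
    fetch: Callable[[Path], None],
    analyze: Callable[[Path], List[str]],
) -> List[str]:
    """Fetch code into a scratch directory, analyze it, then delete it.

    Only the analysis results leave this function; the checkout is
    removed when the `with` block exits, even if analysis raises.
    """
    with tempfile.TemporaryDirectory() as workdir:
        checkout = Path(workdir)
        fetch(checkout)           # e.g. a `git clone` into `checkout`
        return analyze(checkout)  # results are computed before cleanup


# Usage with stand-ins for the real clone and analysis steps.
def fake_fetch(target: Path) -> None:
    (target / "app.py").write_text("x = 1\n")


def list_files(target: Path) -> List[str]:
    return sorted(p.name for p in target.iterdir())


print(analyze_ephemerally(fake_fetch, list_files))  # ['app.py']
```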


Thanks for reading! If you use Codacy you can add them to your stack here.
