How Codacy Analyzes 30 Billion Lines of Code Per Day

Codacy
Codacy automates code reviews and monitors code quality on every commit and pull request. It reports the impact of each commit or pull request in terms of new issues concerning code style, best practices, security, and more. It monitors changes in code coverage, code duplication, and code complexity, helping developers save time in code reviews and tackle technical debt efficiently.

Editor's Note: Jaime Jorge is co-founder and CEO at Codacy.

Codacy helps dev teams of all sizes to automate their code quality by identifying issues through static code analysis, both in the cloud and on-premise. The product notifies users about security issues, code coverage, code duplication and code complexity in every commit and pull request, directly from their current workflow. We sat down with Jaime to learn more about the technology behind Codacy's automated code review platform.

StackShare: Why did you and your other co-founder create Codacy?

Jaime Jorge: Being both developers, we started the company because we wanted to help developers focus on software development instead of just fixing code. I was researching this topic for my master's thesis (working with Telcos in Europe) to understand technical debt (in terms of code duplication), and Joao (my co-founder) was leading tech teams in the financial industry in the UK. What brought us together was the mission of helping as many developers and companies as we could to ship better code and increase their productivity.

Founded in 2012, Codacy now employs 40 people (more than half of whom are technical) between our offices in Lisbon and NYC.

StackShare: Out of the 28 supported languages, which one do you see used the most on your platform?

JJ: The usage distribution of our supported programming languages follows what you’d expect to see looking at indexes/ranks like the one from TIOBE. The most used language in Codacy is JavaScript. This is a result of a strong clustering of web development use cases. We then see Java, Python, Ruby and a few others close behind.

StackShare: It’s amazing how small your team is yet you support so many different languages.

JJ: When we started Codacy, we only supported Scala (on which our product is built). Following requests from new users over time, we started adding support for additional languages. We understood that modern development does not rely on one programming language alone, and modern tech stacks most often combine many different languages. This forced us to create a platform that would make it easy for us to add new programming languages and also to update their support. We also allowed our users to bring their own language support by exposing our integration mechanism.
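The interview does not describe how Codacy's integration mechanism works, so the following is only a generic plugin-registry sketch of the idea of "easy to add a new language". Every name in it (`register`, `analyze`, the two toy checks) is hypothetical and not part of any Codacy API.

```python
from typing import Callable, Dict, List

# An analyzer takes source text and returns human-readable findings.
Analyzer = Callable[[str], List[str]]

# Hypothetical registry mapping a language name to its analyzer.
_ANALYZERS: Dict[str, Analyzer] = {}


def register(language: str) -> Callable[[Analyzer], Analyzer]:
    """Decorator that plugs an analyzer into the registry for one language."""
    def decorator(fn: Analyzer) -> Analyzer:
        _ANALYZERS[language] = fn
        return fn
    return decorator


@register("python")
def check_python(source: str) -> List[str]:
    # Toy rule: flag bare "except:" clauses.
    return [
        f"line {n}: avoid bare 'except:'"
        for n, line in enumerate(source.splitlines(), start=1)
        if line.strip() == "except:"
    ]


@register("javascript")
def check_javascript(source: str) -> List[str]:
    # Toy rule: flag 'var' declarations.
    return [
        f"line {n}: prefer 'let' or 'const' over 'var'"
        for n, line in enumerate(source.splitlines(), start=1)
        if line.lstrip().startswith("var ")
    ]


def analyze(language: str, source: str) -> List[str]:
    """Dispatch to whichever analyzer is registered for the language."""
    analyzer = _ANALYZERS.get(language)
    if analyzer is None:
        raise KeyError(f"no analyzer registered for {language!r}")
    return analyzer(source)


print(analyze("javascript", "var x = 1;\nlet y = 2;"))
```

With this shape, supporting another language means registering one more function; the dispatch code does not change.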

StackShare: How do you use Codacy to build Codacy?

JJ: Our team uses Codacy every day, primarily to maintain the same criteria of development (formatting, coverage, best practices) across the different dev squads. There are features we use more often than others, which mirrors what we see from our customers.

StackShare: Which features does your team use most often?

JJ: Some team members like to use the dashboards to keep track of the main quality metrics, while others like the build status we provide to make sure we’re within the defined criteria. The whole team uses the auto-comment feature, which helps our teams stay in touch.

StackShare: What platforms do you integrate with?

JJ: Our most popular integrations are with GitHub, GitLab, Bitbucket, CircleCI, Jenkins, and Slack, although we support many others.

StackShare: How does Codacy provide notifications for security issues?

JJ: As part of our code analysis, we provide security notifications via the tools we integrate with.

StackShare: Tell us about your secure development practices.

JJ: We develop following security best practices and frameworks (OWASP Top 10, SANS Top 25). Our developers participate in regular security training to learn about common vulnerabilities and threats, and we review our code for security vulnerabilities. We also regularly update our dependencies and make sure none of them has known vulnerabilities.

Our teams use Static Application Security Testing (SAST) to detect basic security vulnerabilities in our codebase, and Dynamic Application Security Testing (DAST) to scan our applications.

StackShare: What’s the biggest mistake new developers make when setting up an automated code review system?

JJ: Incorrect or incomplete configuration.

StackShare: How many automated code reviews do you process daily?

JJ: We pull about 8TB per day. Assuming 1 byte per character and 256 characters per line, that works out to ~3*10^10 lines (about 30 billion). Interestingly, this is about 40% of the text content in the Library of Congress (according to Wolfram Alpha).
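The estimate can be checked with a few lines of arithmetic. This sketch uses the two assumptions stated in the answer (1 byte per character, 256 characters per line) and additionally assumes 1 TB = 10^12 bytes; the function name is illustrative.

```python
def lines_per_day(bytes_per_day: int,
                  bytes_per_char: int = 1,
                  chars_per_line: int = 256) -> int:
    """Estimate lines of code from a daily data volume."""
    return bytes_per_day // (bytes_per_char * chars_per_line)


# "About 8TB per day", taking 1 TB = 10**12 bytes (an assumption).
estimate = lines_per_day(8 * 10**12)
print(f"{estimate:,} lines/day")  # 31,250,000,000 lines/day
```

That is roughly 3.1*10^10, consistent with the "about 30 billion" figure. Reading 8TB as 8 * 2^40 bytes instead gives about 3.4*10^10, the same order of magnitude.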

StackShare: How do you store all of that data?

JJ: All of our services run in the cloud on AWS. We don’t host or run our own routers, load balancers, DNS servers, or physical servers.

StackShare: What AWS services do you use specifically for getting that data processed, indexed and stored?

JJ: Data is processed using EC2 instances. We currently run our applications using Docker on Elastic Beanstalk, but we are transitioning to EKS. The data is stored on RDS, where we use both Aurora and Postgres. Although the volume of data we pull to analyze is 8TB, the analysis results (that we actually store) are significantly smaller. You don’t need the code verbatim for every source file - you just store the issues and where in the file you found them. We then leverage AWS to scale elastically (e.g. the number of active analysis servers) with the current load.
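To illustrate why stored results are so much smaller than the analyzed code, here is a minimal sketch of an analysis that keeps only issue records (file, line, rule, message) and never the source itself. The `Issue` fields, the `line-too-long` rule, and the function name are hypothetical; the interview does not describe Codacy's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class Issue:
    file_path: str   # path relative to the repository root
    line: int        # 1-based line number where the issue was found
    pattern_id: str  # hypothetical rule identifier
    message: str


def find_long_lines(file_path: str, source: str, max_len: int = 120) -> List[Issue]:
    """Toy analysis: flag lines longer than max_len, returning Issue records only."""
    return [
        Issue(file_path, n, "line-too-long",
              f"line is {len(text)} characters (limit {max_len})")
        for n, text in enumerate(source.splitlines(), start=1)
        if len(text) > max_len
    ]


source = "short line\n" + "x" * 200 + "\nanother short line\n"
issues = find_long_lines("src/app.py", source)

# What would be persisted: locations and messages, not the code.
stored = json.dumps([asdict(issue) for issue in issues])
print(stored)
```

The serialized record points at line 2 of `src/app.py` without containing the 200-character line, which is the property the answer describes.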

StackShare: Does this process still involve Scala or another language?

JJ: Our applications are all implemented in Scala. They do all the heavy lifting regarding data processing/indexing.

StackShare: How long do you retain that data?

JJ: The repositories are cloned, analyzed and then deleted.
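A clone, analyze, delete cycle can be sketched with a temporary directory that is removed as soon as the analysis returns. This is an illustration under assumptions, not Codacy's implementation: the helper names are hypothetical, the "analysis" is a toy line count, and the shallow `git clone --depth 1` is one possible choice.

```python
import subprocess
import tempfile
from pathlib import Path


def git_clone(url: str, dest: Path) -> None:
    # Shallow clone: only the latest commit is fetched.
    subprocess.run(["git", "clone", "--depth", "1", url, str(dest)], check=True)


def count_lines(repo_dir: Path) -> int:
    """Toy 'analysis': count lines in all files outside the .git directory."""
    total = 0
    for path in Path(repo_dir).rglob("*"):
        if path.is_file() and ".git" not in path.parts:
            total += len(path.read_text(errors="ignore").splitlines())
    return total


def analyze_ephemeral(url: str, clone=git_clone, analyze=count_lines):
    """Clone into a temporary directory, analyze, and delete the clone."""
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = Path(tmp) / "repo"
        clone(url, repo_dir)
        # Only the analysis result leaves this block; the directory
        # and everything in it are removed when the block exits.
        return analyze(repo_dir)
```

Calling `analyze_ephemeral("<repository URL>")` would return the line count while leaving nothing on disk. The `clone` and `analyze` parameters exist so either step can be swapped out, for example with a stub in tests.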


Thanks for reading! If you use Codacy you can add them to your stack here.

Open jobs at Codacy
Software Developer

Codacy is the leading code quality cloud platform helping thousands of developers ship billions of lines of code per day. Our mission is to help software development teams make great engineering decisions and create productivity through quality. We love crafting software and we're obsessed about helping developers and teams have better code.

As with any component of our product, every person takes relevant ownership and is expected to take decisions autonomously, and so will you. We're a small team of highly dedicated people who get things done quickly. We'd love to have your opinion and hear your thoughts.


Responsibilities:

  • Be part of one of our development teams and one of the top developers in the company
  • Develop our core products and components
  • Collaborate with other teams in order to improve the overall architecture of our application
  • Define, implement and support a distributed application
  • Improve the application lifecycle, from deployment to real-time monitoring processes

Requirements

  • 2+ years of experience with at least one Object-Oriented Programming language and the ability to write efficient, maintainable code
  • Wanting to learn to develop and support data-driven applications in Scala
  • Experience with concurrent users application development
  • Knowledge of SQL and familiarity with other data structures
  • Familiarity with data structures and fundamentals of algorithm design
  • Ability to solve practical problems and deal with a variety of concrete variables
  • Design, communicate, and implement solutions effectively
  • API development and Feature Development experience
  • Knowledge and experience with Git and Git Workflows
  • Experience with SOA, micro-services and/or RESTful design patterns
  • AWS experience is desirable but not a requisite
  • Bachelor’s degree in Computer Science, Computer Information Systems or a closely related field
  • Strong knowledge of English

Benefits

  • Challenging and awesome opportunity
  • High impact work with a small team of amazing people
  • Competitive Salary
  • Job based in Lisbon, Portugal
  • Flexible hour schedule
  • Paid Time Off
  • Work From Home
  • Performance Bonus
  • Training & Development
  • Complimentary office snacks and drinks
DevOps

Codacy is the leading code quality cloud platform helping thousands of developers ship billions of lines of code per day. Our mission is to help software development teams make great engineering decisions and create productivity through quality. We love crafting software and we're obsessed about helping developers and teams have better code.


As with any component of our product, every person takes relevant ownership and is expected to take decisions autonomously, and so will you. We're a small team of highly dedicated people who get things done quickly. We'd love to have your opinion and hear your thoughts.


Responsibilities

The candidate will be an ascending DevOps engineer seeking the summit of software development and deployment, and we need you to take us to the top. We need a DevOps engineer with strong cloud services experience (AWS preferred) to improve the services that support our custom Scala application. You will be responsible for maintaining or redefining our whole stack, from ELBs to databases, transforming dev, testing and production environments. In this role, you’ll work collaboratively with software engineering to deploy and operate our systems. You’ll help automate and streamline our operations and processes, build and maintain tools for deployment, monitoring and operations, and troubleshoot and resolve issues in our dev, test and production environments.


Requirements

  • Strong background in Linux/Unix Administration
    • Docker experience is a big plus
  • Experience with automation/configuration management using either Puppet, Chef or an equivalent
  • Ability to use a wide variety of open source technologies and cloud services (experience with AWS is a big plus)
  • Experience with at least one major cloud service provider (e.g., AWS, Google Cloud, Azure)
  • Strong experience in database maintenance and optimization (SQL and Postgres, NoSQL experience is a plus, too)
  • A working understanding of code and script (Bash, Ruby, Python, or equivalent)
  • Knowledge of best practices and IT operations in an always-up, always-available service
  • Experience with continuous delivery techniques.

Benefits

  • Challenging and awesome opportunity
  • High impact work with a small team of amazing people
  • Competitive Salary
  • Job based in Lisbon, Portugal
  • Flexible hour schedule
  • Paid Time Off
  • Work From Home
  • Performance Bonus
  • Training & Development
  • Complimentary office snacks and drinks
Senior Software Developer (Team Leader)

Codacy is the leading code quality cloud platform helping thousands of developers ship billions of lines of code per day. Our mission is to help software development teams make great engineering decisions and create productivity through quality. We love crafting software and we're obsessed about helping developers and teams have better code.

As with any component of our product, every person takes relevant ownership and is expected to take decisions autonomously, and so will you. We're a small team of highly dedicated people who get things done quickly. We'd love to have your opinion and hear your thoughts.

As a team leader, your responsibility will be to build a new team focused on improving the Codacy Enterprise offering. You will manage everything Enterprise, from the development and deployment process to supporting current customers with their existing infrastructure. You will guide our users and make their lives easier by implementing the next generation of Codacy Enterprise. Our focus every day is being the clearest and most useful product, something you will help us continue to achieve.


Responsibilities:

  • Be the leader of one of our development teams
  • Improve the design of the enterprise application architecture and deployment process
  • Enhance the development of our core products and components
  • Collaborate with other teams in order to improve the overall architecture of our application
  • Define, implement and support a distributed application
  • Improve the application lifecycle, from deployment to real-time monitoring processes

Requirements

  • 4+ years of experience with at least one Object-Oriented Programming language and the ability to write efficient, maintainable code
  • Able to lead a team and successfully design an infrastructure to manage thousands of users concurrently
  • Experience with concurrent users application development
  • Ability to solve practical problems and deal with a variety of concrete variables
  • Design, communicate, and implement solutions effectively
  • API development and Feature Development experience
  • Experience with SOA, docker, micro-services and/or RESTful design patterns
  • AWS experience is desirable
  • Strong knowledge of English

Benefits

  • Challenging and awesome opportunity
  • High impact work with a small team of amazing people
  • Competitive Salary
  • Job based in Lisbon, Portugal
  • Flexible hour schedule
  • Paid Time Off
  • Work From Home
  • Performance Bonus
  • Training & Development
  • Complimentary office snacks and drinks
Senior Software Developer

Codacy is the leading code quality cloud platform helping thousands of developers ship billions of lines of code per day. Our mission is to help software development teams make great engineering decisions and create productivity through quality. We love crafting software and we're obsessed about helping developers and teams have better code.

As with any component of our product, every person takes relevant ownership and is expected to take decisions autonomously, and so will you. We're a small team of highly dedicated people who get things done quickly. We'd love to have your opinion and hear your thoughts.


Responsibilities:

  • Be part of one of our development teams and one of the top developers in the company
  • Develop our core products and components
  • Collaborate with other teams in order to improve the overall architecture of our application
  • Define, implement and support a distributed application
  • Improve the application lifecycle, from deployment to real-time monitoring processes

Requirements

  • 2+ years of experience with at least one Object-Oriented Programming language and the ability to write efficient, maintainable code
  • Wanting to learn to develop and support data-driven applications in Scala
  • Experience with concurrent users application development
  • Knowledge of SQL and familiarity with other data structures
  • Familiarity with data structures and fundamentals of algorithm design
  • Ability to solve practical problems and deal with a variety of concrete variables
  • Design, communicate, and implement solutions effectively
  • API development and Feature Development experience
  • Knowledge and experience with Git and Git Workflows
  • Experience with SOA, micro-services and/or RESTful design patterns
  • AWS experience is desirable but not a requisite
  • Bachelor’s degree in Computer Science, Computer Information Systems or a closely related field
  • Strong knowledge of English

Benefits

  • Challenging and awesome opportunity
  • High impact work with a small team of amazing people
  • Competitive Salary
  • Job based in Lisbon, Portugal
  • Flexible hour schedule
  • Paid Time Off
  • Work From Home
  • Performance Bonus
  • Training & Development
  • Complimentary office snacks and drinks