How Codacy Analyzes 30 Billion Lines of Code Per Day

Codacy
Codacy automates code reviews and monitors code quality on every commit and pull request. It reports the impact of each commit or pull request in terms of new issues concerning code style, best practices, security, and more. It monitors changes in code coverage, code duplication, and code complexity. It helps developers save time in code reviews and tackle technical debt efficiently.

Editor's Note: Jaime Jorge is co-founder and CEO at Codacy.

Codacy helps dev teams of all sizes to automate their code quality by identifying issues through static code analysis, both in the cloud and on-premise. The product notifies users about security issues, code coverage, code duplication and code complexity in every commit and pull request, directly from their current workflow. We sat down with Jaime to learn more about the technology behind Codacy's automated code review platform.

StackShare: Why did you and your other co-founder create Codacy?

Jaime Jorge: Both being developers, we started the company because we wanted to help developers focus on software development instead of just fixing code. I was researching this topic for my master's thesis (working with telcos in Europe) to understand technical debt in terms of code duplication, and Joao (my co-founder) was leading tech teams in the financial industry in the UK. What brought us together was the mission of helping as many developers and companies as we could to ship better code and increase their productivity.

Founded in 2012, Codacy now employs 40 people (more than half of whom are technical) between our offices in Lisbon and NYC.

StackShare: Out of the 28 supported languages, which one do you see used the most on your platform?

JJ: The usage distribution of our supported programming languages follows what you’d expect to see in indexes and rankings like the one from TIOBE. The most used language in Codacy is JavaScript. This is a result of a strong clustering of web development use cases. We then see Java, Python, Ruby and a few others close behind.

StackShare: It’s amazing how small your team is yet you support so many different languages.

JJ: When we started Codacy, we only supported Scala (on which our product is built). Following requests from new users over time, we started adding support for more languages. We understood that modern development does not rely on one programming language alone, and that modern tech stacks most often combine many different languages. This forced us to create a platform that would make it easy for us not only to add new programming languages but also to update their support. We also let our users bring their own language support by exposing our integration mechanism.

StackShare: How do you use Codacy to build Codacy?

JJ: Our team uses Codacy every day, primarily to maintain the same criteria of development (formatting, coverage, best practices) across the different dev squads. There are features we use more often than others, which mirrors what we see from our customers.

StackShare: Which features does your team use most often?

JJ: Some team members like to use the dashboards to keep track of the main quality metrics; others like the build status we provide to make sure we’re within the defined criteria. The whole team uses the auto-comment feature, which helps our teams stay in touch.

StackShare: What platforms do you integrate with?

JJ: Our most popular integrations are with GitHub, GitLab, Bitbucket, CircleCI, Jenkins, and Slack, although we support many others.

StackShare: How does Codacy provide notifications for security issues?

JJ: As part of our code analysis, we provide security notifications via the tools we integrate with.

StackShare: Tell us about your secure development practices?

JJ: We develop following security best practices and frameworks (OWASP Top 10, SANS Top 25). Our developers participate in regular security training to learn about common vulnerabilities and threats, and we review our code for security vulnerabilities. We also regularly update our dependencies and make sure none of them has known vulnerabilities.

Our teams use Static Application Security Testing (SAST) to detect basic security vulnerabilities in our codebase, and Dynamic Application Security Testing (DAST) to scan our applications.

StackShare: What’s the biggest mistake new developers make when setting up an automated code review system?

JJ: Incorrect or incomplete configuration.

StackShare: How many automated code reviews do you process daily?

JJ: We pull about 8TB per day. Assuming 1 byte per character and 256 characters per line, we arrive at ~3*10^10 lines (about 30 billion). Interestingly, this is about 40% of the text content in the Library of Congress (according to Wolfram Alpha).
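
The arithmetic behind that estimate can be reproduced in a few lines. This is only a back-of-the-envelope sketch: it assumes decimal terabytes (10^12 bytes), which the interview does not specify, and the function name is made up for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in the interview.
BYTES_PER_DAY = 8 * 10**12   # ~8 TB per day; decimal terabytes are an assumption
BYTES_PER_CHAR = 1           # assumption stated in the interview
CHARS_PER_LINE = 256         # assumption stated in the interview


def estimate_lines_per_day(bytes_per_day=BYTES_PER_DAY,
                           bytes_per_char=BYTES_PER_CHAR,
                           chars_per_line=CHARS_PER_LINE):
    """Convert a daily byte volume into an approximate number of lines."""
    return bytes_per_day // (bytes_per_char * chars_per_line)


lines = estimate_lines_per_day()
print(f"{lines:,} lines per day")  # prints "31,250,000,000 lines per day"
```

The result, roughly 3.1*10^10, rounds to the 30 billion figure in the title.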

StackShare: How do you store all of that data?

JJ: All of our services run in the cloud on AWS. We don’t host or run our own routers, load balancers, DNS servers, or physical servers.

StackShare: What AWS services do you use specifically for getting that data processed, indexed and stored?

JJ: Data is processed using EC2 instances. We currently run our applications using Docker on Elastic Beanstalk, but we are transitioning to EKS. The data is stored on RDS, where we use both Aurora and Postgres. Although the volume of data we pull to analyze is 8TB per day, the analysis results (which are what we actually store) are significantly smaller. You don’t need the code verbatim for every source file; you just store the issues and where in the file you found them. We then leverage AWS to scale elastically (e.g. the number of active analysis servers) with the current load.
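
To illustrate the idea of storing only the issues and their locations rather than the source itself, here is a minimal sketch. The `Issue` fields and the toy `find_long_lines` check are hypothetical and are not Codacy's actual schema or analysis; they only show what a "finding plus location" record could look like.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Issue:
    """A single finding: where it is and what it is (hypothetical fields)."""
    path: str
    line: int
    pattern_id: str
    message: str


def find_long_lines(path, source, max_len=120):
    """Toy analyzer: report every line longer than max_len characters."""
    return [
        Issue(path, number, "line-too-long",
              f"line exceeds {max_len} characters")
        for number, text in enumerate(source.splitlines(), start=1)
        if len(text) > max_len
    ]


source = "short line\n" + "x" * 200 + "\nanother short line\n"
issues = find_long_lines("src/example.py", source)

# Only the findings are serialized for storage; the source text is not kept.
stored = json.dumps([asdict(issue) for issue in issues])
print(stored)
```

Each record is a few short fields regardless of how large the analyzed file is, which is why the stored results can be far smaller than the data pulled for analysis.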

StackShare: Does this process still involve Scala or another language?

JJ: Our applications are all implemented in Scala. They do all the heavy lifting regarding data processing/indexing.

StackShare: How long do you retain that data?

JJ: The repositories are cloned, analyzed and then deleted.
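
A clone, analyze, delete lifecycle can be sketched with a temporary directory that is removed as soon as the analysis finishes. This is an illustration under assumptions, not Codacy's implementation: `analyze_and_discard`, `fake_fetch`, and `count_lines` are hypothetical names, and the fetch step writes a local file instead of running a real `git clone` so the example works offline.

```python
import tempfile
from pathlib import Path


def analyze_and_discard(fetch, analyze):
    """Fetch a repository into a temporary directory, analyze it, delete it.

    Returns the analysis results and the (now deleted) working path.
    """
    with tempfile.TemporaryDirectory() as workdir:
        repo = Path(workdir)
        fetch(repo)              # a real system would clone the repository here
        results = analyze(repo)  # only the results outlive this block
    # The directory and everything in it is removed when the block exits.
    return results, repo


def fake_fetch(repo):
    """Stand-in for a clone: writes one small source file."""
    (repo / "main.py").write_text("print('hello')\n")


def count_lines(repo):
    """Toy analysis: number of lines per Python file."""
    return {p.name: len(p.read_text().splitlines())
            for p in repo.rglob("*.py")}


results, repo_path = analyze_and_discard(fake_fetch, count_lines)
print(results)             # prints {'main.py': 1}
print(repo_path.exists())  # prints False
```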

