Researcher / Data Scientist



Background and Context

Home to more than 65 million developers on six continents, GitHub is the world’s largest code hosting platform. As such, metrics on GitHub platform usage can play a valuable role in research across many disciplines, including international development, public policy, and economics. Indeed, GitHub platform usage has been used to inform measures of digital engagement, digital readiness, and tech maturity at regional (multi-country) and national levels. An example is the Portulans Institute’s Network Readiness Index 2020.

While GitHub publishes an annual snapshot of platform engagement through its State of the Octoverse reports, it does not currently maintain its own publicly available standardized list of platform usage metrics. Third-party platforms, such as GHTorrent, are used by researchers instead. While GHTorrent has been a valuable resource for the computer science industry, it is not designed for the aforementioned disciplines. Researchers have expressed the need for a smaller list of metrics that are updated at a regular cadence, disaggregated by country, and in a more accessible format such as .csv.

Scope of the Work

The GitHub Tech for Social Good and GitHub Policy teams are issuing this request for proposals (RFP) for background research and guidance on what a list of publicly available GitHub platform usage metrics by country should include and how frequently it should be updated. The metrics would be targeted to researchers in international development, public policy, and economics, but may be used in other disciplines. Example platform usage metrics may be the number and location of active entities, which include users, repositories and organizations. Metrics may also include pull requests, repository forks, and engagement on issues and discussions.
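
As an illustration only, a country-level metrics file in .csv form might be laid out as in the Python sketch below. The column names, the placeholder country codes (AAA, BBB) and every number are assumptions made up for this example. This RFP does not define a schema, and none of the values are real GitHub figures.

```python
import csv
import io

# Hypothetical column names; the RFP does not prescribe any of them.
FIELDS = ["country_code", "period", "active_users",
          "active_repositories", "pull_requests"]

# Placeholder rows with invented numbers, NOT real GitHub figures.
ROWS = [
    {"country_code": "AAA", "period": "2021-Q1", "active_users": 1200,
     "active_repositories": 3400, "pull_requests": 560},
    {"country_code": "BBB", "period": "2021-Q1", "active_users": 800,
     "active_repositories": 1500, "pull_requests": 210},
]


def to_csv(rows):
    """Serialize one row per country and period to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse the CSV text back, converting the count columns to integers."""
    text_columns = ("country_code", "period")
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (value if key in text_columns else int(value))
         for key, value in row.items()}
        for row in reader
    ]


csv_text = to_csv(ROWS)
print(csv_text)
```

A long layout like this, with one row per country and period, would let new timepoints be appended without changing the columns. That is one possible design, not a requirement of this RFP.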

The scope of this RFP does not include the complete quantitative work to calculate the actual metrics, though the selected party may choose to include preliminary quantitative work in their outputs. The scope of this RFP does not include data from private GitHub repositories or GitHub Enterprise accounts.

Applying to and/or winning this RFP neither guarantees nor excludes any individual, company or organization from applying to and/or winning subsequent RFPs.

Proposed Timeline of Deliverables

Develop and Finalize Research Plan; Conduct Research (October – November 2021)

  • Through interviews and desk research, develop a comprehensive view of how researchers in the target disciplines currently use GitHub platform usage data
  • Investigate the types of GitHub platform usage data requests, gaps in third-party platforms, and un/met needs of the data requesters
  • Investigate the technical aspects of GitHub platform usage data requests. This may include the process researchers use in their data collection and relevant statistical calculations or adjustments they make.

Recommendations & Report (December 2021 – February 2022)

  • Provide detailed insights on researchers’ needs for GitHub platform metrics
  • Define a list of GitHub platform metrics to include for each country
  • Suggest possible GitHub platform metrics to capture diversity & inclusion
  • Recommend ameliorations for common edge cases, e.g. how calculations might change if a government blocks GitHub usage for an extended period of time
  • Advise on the period of time covered by the metrics calculations and how often new timepoints of the metrics should be added to remain current for researchers
  • Work with GitHub staff to understand the labor requirements for creating and maintaining GitHub platform usage metrics of this nature
  • Provide guidance for researchers to effectively use the metrics, including recommended proxy indicators and multipliers (e.g. for population), formats and dissemination methods (e.g. GitHub repo, external websites)
  • Produce a final report of the research, findings and guidance. GitHub may choose to publish all or part of the report
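
One way to read the population multiplier mentioned in the list above is as a per-capita rate. The Python sketch below assumes a hypothetical helper, per_capita, and invented inputs. It is not a method prescribed by this RFP, and the selected party may recommend a different adjustment.

```python
from typing import Optional


def per_capita(count: int, population: Optional[int],
               per: int = 100_000) -> Optional[float]:
    """Scale a raw count to a rate per `per` inhabitants.

    Returns None when the population is missing or not positive,
    so that gaps stay visible instead of turning into misleading zeros.
    """
    if not population or population <= 0:
        return None
    return count * per / population


# Invented example: 1,200 active users in a country of 3,000,000 people.
rate = per_capita(1_200, 3_000_000)
print(rate)  # 40.0 active users per 100,000 inhabitants
```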

Project Management (ongoing)

  • Set the agenda and lead regular meetings with the GitHub teams
  • Regularly and clearly communicate new and major findings, project blockers and what is needed and expected from the GitHub teams

Skills, Experience and Eligibility

Skills and Experience

  • A deep understanding of qualitative and quantitative research methods and research needs in international development, public policy or economics OR the ability to uncover those insights
  • Previous relevant experience creating, designing or informing indexes used in international development, public policy or economics research
  • Previous relevant experience using indexes to support research in international development, public policy or economics
  • Excellent qualitative research, research synthesis and writing skills
  • Excellent communication skills, especially in remote settings
  • Strong project management, interviewing, communication and facilitation skills
  • Familiarity with the open source ecosystem and the GitHub platform is a strong plus
  • A good understanding of how the needs of low- and middle-income countries, underrepresented groups in tech and marginalized communities should be accounted for in this work is a strong plus
  • Professional proficiency in English is required. Other languages are encouraged but not required


Eligibility

  • Individual researchers, companies and organizations are eligible
  • Non-profit and for-profit companies and organizations are eligible
  • The successful candidate may be based anywhere in the world and must be available for meetings with GitHub staff located on the US East and US West coasts
  • A stable internet connection is required
  • People from low- and middle-income countries, people who identify as women, and people who identify as LGBTQI+ are strongly encouraged to apply


Location: Virtual
Posted on: August 31, 2021

How to Apply

Proposal Requirements

1. High-level workplan, including methodologies, approach, and anticipated software packages that will be used, if any

2. Previous examples of work in alignment with deliverables

3. 1-2 references

4. A budget proposal that does not exceed USD 25,000


Please send clarifying questions (if any) and proposals to

Mala Kumar, malakumar85@github.com

Cynthia Lo, csmlo@github.com

Peter Cihon, pcihon@github.com

Answers to clarifying questions will be posted on an ongoing basis in the “Engagement Opportunities” section on https://socialimpact.github.com. GitHub will notify anyone who sends in clarifying questions of updated answers.

NOTE: Microsoft is GitHub’s parent company. If selected for an interview, the individual, company or organization that applied must sign and submit a Microsoft NDA. NDAs are completed and stored online.