Analyze GitHub Profiles Using Olostep API and GPT-4 in Streamlit

Ehsan
Growth Engineer, Olostep

Want to extract meaningful insights from a developer’s GitHub profile automatically? In this guide, we’ll walk through how to build a Streamlit web app that:

  • Scrapes GitHub profile content using Olostep’s Scrape API
  • Uses OpenAI GPT-4 to analyze and summarize profile insights
  • Displays the results neatly using Streamlit

Objectives

  • Scrape a public GitHub profile using the Olostep API
  • Generate an analysis using OpenAI GPT-4
  • Build an intuitive UI with Streamlit
  • Present insights such as skills, contributions, and collaboration

Requirements

Install the following Python packages:

pip install streamlit openai requests python-dotenv

Create a .env file in your root directory with your API credentials:

OLOSTEP_API_KEY=your_olostep_api_key
OPENAI_API_KEY=your_openai_api_key

Project Structure

github-analyzer/
├── app.py
├── .env
└── requirements.txt
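
The requirements.txt listed above can simply mirror the install command; version pins are optional:

streamlit
openai
requests
python-dotenv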

Source code

Step 1: Import Libraries and Load Environment Variables

We start by importing necessary modules and loading API keys from the environment.

import os
import requests
import streamlit as st
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
olostep_api_key = os.getenv("OLOSTEP_API_KEY")
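
If either key is missing, the app will only fail later with a confusing API error, so it can help to fail fast right after loading the environment. A minimal check (the wording of the message is just a suggestion) might look like this:

if not os.getenv("OPENAI_API_KEY") or not olostep_api_key:
    # Stop early with a clear message instead of failing deep inside an API call
    raise RuntimeError("Missing OPENAI_API_KEY or OLOSTEP_API_KEY - check your .env file")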

Step 2: Scrape GitHub Profile using Olostep

Use Olostep's scrape endpoint to retrieve the markdown content of a GitHub profile.

def scrape_profile(username):
    # Reuse the key loaded in Step 1 and ask Olostep for the profile page as markdown
    headers = {
        "Authorization": f"Bearer {olostep_api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "formats": ["markdown"],
        "country": "US",
        "url_to_scrape": f"https://github.com/{username}"
    }
    res = requests.post("https://api.olostep.com/v1/scrapes", headers=headers, json=payload, timeout=60)
    if res.status_code != 200:
        return ""  # the UI treats an empty string as a failed scrape
    return res.json().get("result", {}).get("markdown_content", "")
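
Before wiring the function into the UI, you can sanity-check it from a separate script or a Python REPL. The username below is just an example:

md = scrape_profile("torvalds")   # any public GitHub username works here
print(md[:300] if md else "No markdown returned - check the API key and username")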

Step 3: Analyze Markdown Content with GPT-4

Send the scraped markdown to OpenAI to generate a detailed GitHub profile analysis.

def analyze_with_gpt(username, markdown):
    # Build a single prompt containing the scraped markdown and the aspects we want covered
    prompt = f"""
    Analyze the GitHub profile below for insights:
    - Professional background (company, location, role)
    - Activity (repos, stars, streaks)
    - Tech stack
    - Community engagement

    Username: {username}

    Markdown Content:
    {markdown}
    """
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()
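
Profiles with many pinned repositories can produce long markdown, and gpt-4 has a limited context window. One simple safeguard, using a rough character budget rather than exact token counting, is to truncate the markdown before building the prompt:

MAX_CHARS = 12000  # rough character budget; adjust based on how much detail you need

def truncate_markdown(markdown, limit=MAX_CHARS):
    # Keep only the first `limit` characters so the prompt stays within the model's context window
    return markdown if len(markdown) <= limit else markdown[:limit] + "\n...[truncated]"

You could then pass truncate_markdown(md) instead of md when calling analyze_with_gpt from the UI.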

Step 4: Build Streamlit UI

Construct a simple web UI with Streamlit to input a username and display results.

st.set_page_config(page_title="GitHub Analyzer", page_icon="🐙")
st.title("🐙 GitHub Profile Analyzer")
username = st.text_input("Enter a GitHub username (e.g., torvalds)")

if st.button("Analyze Profile"):
    if not username.strip():
        st.warning("Please enter a GitHub username first.")
    else:
        with st.spinner("Scraping GitHub profile..."):
            md = scrape_profile(username.strip())
        if not md:
            st.error("Could not scrape the profile.")
        else:
            with st.spinner("Generating AI analysis..."):
                report = analyze_with_gpt(username, md)
            st.markdown("## 🧠 GPT-4 Generated Insights")
            st.markdown(report)
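
With everything saved in app.py, launch the app from the project root:

streamlit run app.py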

Example Output

Here’s a sample response from the app for user torvalds:

GitHub Profile Analysis: torvalds

## 1. Professional Background
• Location: Portland, Oregon
• Works on the Linux kernel

## 2. Activity Analysis
• Top repository: linux
• Thousands of contributions
• Maintains core system software

## 3. Tech Stack
• C, C++, Shell scripting

## 4. Community Engagement
• Collaborates with hundreds of devs
• Active in pull request reviews

Use Cases

  • Hiring teams evaluating developer contributions
  • Candidates generating portfolio summaries
  • Open-source communities reviewing contributors
  • AI agents assessing technical depth of profiles

Next Steps

  • Add export to PDF
  • Add batch processing for multiple usernames (a rough sketch follows below)
  • Deploy to Streamlit Cloud
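
The batch-processing idea could start as a thin loop over the two functions defined earlier; the sketch below keeps error handling minimal and simply skips profiles that fail to scrape:

def analyze_many(usernames):
    # Collect one GPT-4 report per username; profiles that fail to scrape are skipped
    reports = {}
    for name in usernames:
        md = scrape_profile(name)
        if md:
            reports[name] = analyze_with_gpt(name, md)
    return reports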

Final Thoughts

This full-stack project combines web scraping, AI summarization, and an interactive frontend to deliver structured GitHub insights. With Olostep + OpenAI + Streamlit, you can automate what once required hours of manual review.

Happy hacking! 🧙‍♂️