Automating Portfolio Report Generation with Python

Automatically generate and download all reports for a Portfolio

Overview

Generating SecurityScorecard reports for every company in a portfolio by hand is repetitive and time-consuming. This guide shows how to automate the process with a Python script. We'll use the SecurityScorecard API to retrieve the companies in a portfolio, request a report for each one, wait for the reports to finish, and package the results into a single ZIP file.

This step-by-step guide will walk you through building and using this script, providing a quick start to working with the SecurityScorecard API.

Want to get right to it? Head to the 'Using the Script' section to download the script and learn how to use it.

Building the Script

Prerequisites

Before running this script, ensure you have the following:

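  • Python: Python 3.6 or later is required, because the script uses f-strings. You can download it from the official Python website (python.org).
  • Python Libraries: The script uses the following Python libraries. Only requests is a third-party package that must be installed with pip in your terminal (pip install requests). The others are part of Python's standard library and need no installation.
    • requests: For making HTTP requests to the SecurityScorecard API.
    • zipfile: For creating and managing ZIP archives.
    • os: For interacting with the operating system (e.g., file paths).
    • datetime: For working with dates and times (e.g., for file naming).
    • time: For pausing between report status checks and enforcing the timeout.
    • json: For handling JSON decoding errors.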
  • SecurityScorecard API Key: You will need a valid SecurityScorecard API key to authenticate with the API. Obtain this from your SecurityScorecard account.
    • Note: You must have access to the SecurityScorecard API to make API calls.
  • SecurityScorecard Portfolio ID: You will also need the ID of the SecurityScorecard portfolio for which you want to generate reports.
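
If you would rather not paste the API key into the script, one option is to read both values from environment variables. The sketch below is only an illustration: the variable names SSC_API_TOKEN and SSC_PORTFOLIO_ID and the helper load_credentials are made up for this example and are not defined by SecurityScorecard or by the script in this guide.

```python
import os


def load_credentials(env=None):
    """Return (api_token, portfolio_id) read from environment variables.

    SSC_API_TOKEN and SSC_PORTFOLIO_ID are hypothetical names chosen for
    this example; use whatever names fit your environment.
    """
    env = os.environ if env is None else env
    api_token = env.get("SSC_API_TOKEN", "").strip()
    portfolio_id = env.get("SSC_PORTFOLIO_ID", "").strip()

    # Collect the names of any values that are missing or empty.
    missing = [
        name
        for name, value in (
            ("SSC_API_TOKEN", api_token),
            ("SSC_PORTFOLIO_ID", portfolio_id),
        )
        if not value
    ]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return api_token, portfolio_id
```

Failing early with a clear message avoids sending requests with an empty token or portfolio ID.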

Script Outline

This script automates the SecurityScorecard report generation workflow through these steps, leveraging the SecurityScorecard API:

  1. Portfolio Company Retrieval: The script begins by using the SecurityScorecard API to retrieve a list of all companies associated with a given portfolio ID.
  2. Generate Reports for each Company: For each company obtained, the script triggers the generation of summary reports using the API.
  3. Report Completion Check: The script then enters a loop to monitor the API's response and ensure all reports are ready for download.
  4. Report Download and Archiving: Finally, the script downloads the completed reports as PDF files and packages them into a single ZIP archive for easy access and distribution.

In the following sections, we'll walk through the code implementation for each step.

Step 1: Get Companies from your Portfolio

Begin by defining the get_portfolio_companies function. This function accepts the portfolio_id argument, which specifies the portfolio to retrieve companies from and generate reports for.

Function Definition

def get_portfolio_companies(portfolio_id):

API Endpoint and Headers:

Next, set up the GET request that retrieves all companies in a portfolio. Construct the API endpoint URL, making sure it includes the provided portfolio_id. Then define the request headers, including the Authorization header with your API token. The api_token variable is defined at the top of the script, as shown in 'Tying it all together'.

    url = f"https://api.securityscorecard.io/portfolios/{portfolio_id}/companies"
    headers = {
        "Authorization": f"Token {api_token}",
        "accept": "application/json; charset=utf-8",
    }

You can test out the 'Get all companies in a portfolio' endpoint on our API Reference page.


Making the API Request:

Then we will execute the API call to retrieve the list of companies within the specified portfolio:

    response = None  # Stays None if the request fails before a response arrives
    try:
        response = requests.get(url, headers=headers)
        response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)

Handling API Response:

After making the request, we need to handle both successful and failed API responses.

  • If the request is successful, we'll retrieve the entries in the portfolio and return them.
  • If the request fails, we'll print an error message and return None.

        # Response successful:
        response_dict = response.json()["entries"]  # Parse the JSON response
        print(f"Successfully retrieved Portfolio ID: {portfolio_id}")
        print(f"{len(response_dict)} Companies Retrieved in Portfolio.")
        return response_dict

    except requests.exceptions.RequestException as e:
        print(f"Error fetching portfolio companies: {e}")
        if response is not None:
            try:
                print(f"Response Body: {response.json()}")
            except json.JSONDecodeError:
                print(f"Response Text: {response.text}")
        return None

    except json.JSONDecodeError as e:
        print(f"Error decoding JSON response: {e}")
        if response is not None:
            print(f"Response Text: {response.text}")
        return None

Finished Function:

def get_portfolio_companies(portfolio_id):
    url = f"https://api.securityscorecard.io/portfolios/{portfolio_id}/companies"
    headers = {
        "Authorization": f"Token {api_token}",
        "accept": "application/json; charset=utf-8",
    }
    response = None  # Stays None if the request fails before a response arrives

    try:
        response = requests.get(url, headers=headers)
        response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)

        # Response successful:
        response_dict = response.json()["entries"]  # Parse the JSON response
        print(f"Successfully retrieved Portfolio ID: {portfolio_id}")
        print(f"{len(response_dict)} Companies Retrieved in Portfolio.")
        return response_dict

    except requests.exceptions.RequestException as e:
        print(f"Error fetching portfolio companies: {e}")
        if response is not None:
            try:
                print(f"Response Body: {response.json()}")
            except json.JSONDecodeError:
                print(f"Response Text: {response.text}")
        return None

    except json.JSONDecodeError as e:
        print(f"Error decoding JSON response: {e}")
        if response is not None:
            print(f"Response Text: {response.text}")
        return None
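
As a quick check of what Step 1 hands to the rest of the script, the sketch below pulls the domains out of a list shaped like the entries value. The sample data and the helper name list_domains are made up for illustration. Only the domain key is taken from the script itself, which reads it in Step 2; the other fields are placeholders.

```python
def list_domains(portfolio_companies):
    """Return the domain of every entry that has one, in order."""
    if not portfolio_companies:  # Covers None (failed request) and an empty list
        return []
    return [c["domain"] for c in portfolio_companies if c.get("domain")]


# Made-up sample shaped like the "entries" list returned in Step 1.
sample_entries = [
    {"domain": "example.com", "name": "Example"},
    {"name": "No Domain Inc."},
    {"domain": "example.org"},
]

print(list_domains(sample_entries))  # ['example.com', 'example.org']
print(list_domains(None))            # []
```

Treating None the same as an empty list means a failed request in Step 1 does not cause a crash further down the pipeline.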

Step 2: Generate summary reports for each company

Now that we have a list of companies, we can start requesting to generate a report for each of them.


Function Declaration & API Endpoint:

We will first create a function called generate_summary_reports, which will take two arguments: portfolio_companies and branding.

Then, we'll set up a POST API request to the reports/summary endpoint (that's where we tell the API to generate the Summary Report).

def generate_summary_reports(portfolio_companies, branding):
    url = "https://api.securityscorecard.io/reports/summary"
    headers = {
        "Authorization": f"Token {api_token}",
        "accept": "application/json; charset=utf-8",
        "content-type": "application/json",
    }

API Request and Response

We'll create a dictionary called results to store the API response for each company. Each response is a confirmation containing the details of the report we requested, including its ID.

After that, we'll loop through the list of companies and make a POST request to the reports/summary API endpoint for each one, including the company domain and branding choices in the payload.

    results = {}
    for company in portfolio_companies:
        company_domain = company["domain"]
        payload = {
            "scorecard_identifier": company_domain,
            "branding": branding,
        }
        response = None  # Reset so a failed request never reports a stale response
        try:
            response = requests.post(url, headers=headers, json=payload)
            response.raise_for_status()
            results[company_domain] = response.json()
            print(f"Successfully requested report generation for: {company_domain}")

Finally, we'll handle any errors that occur during the API calls, just like we did when getting the companies in Step 1.

View and try out the 'Generate a Company Summary report' API endpoint on our API Reference page.


Finished Function:

def generate_summary_reports(portfolio_companies, branding):
    url = "https://api.securityscorecard.io/reports/summary"
    headers = {
        "Authorization": f"Token {api_token}",
        "accept": "application/json; charset=utf-8",
        "content-type": "application/json",
    }

    results = {}
    for company in portfolio_companies:
        company_domain = company["domain"]
        payload = {
            "scorecard_identifier": company_domain,
            "branding": branding,
        }
        response = None  # Reset so a failed request never reports a stale response
        try:
            response = requests.post(url, headers=headers, json=payload)
            response.raise_for_status()
            results[company_domain] = response.json()
            print(f"Successfully requested report generation for: {company_domain}")

        except requests.exceptions.RequestException as e:
            print(f"Error generating report for {company_domain}: {e}")
            results[company_domain] = None
            if response is not None:
                try:
                    print(f"Response Body: {response.json()}")
                except json.JSONDecodeError:
                    print(f"Response Text: {response.text}")

        except json.JSONDecodeError as e:
            print(f"Error decoding JSON response for {company_domain}: {e}")
            results[company_domain] = None
            if response is not None:
                print(f"Response Text: {response.text}")

    # Failed requests are stored as None, so count only the successful ones.
    successful = sum(1 for result in results.values() if result is not None)
    print(f"{successful} of {len(results)} reports successfully requested for generation.")
    return results
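
Because generate_summary_reports stores None for any company whose request failed, it can be useful to separate the two groups before moving on. The following is a minimal sketch with made-up sample data; the helper name split_results is hypothetical and not part of the published script.

```python
def split_results(results):
    """Separate accepted report requests from failed ones.

    `results` maps each domain to its confirmation dict, or to None when
    the request for that domain failed.
    """
    accepted = {d: info for d, info in results.items() if info is not None}
    failed = sorted(d for d, info in results.items() if info is None)
    return accepted, failed


# Made-up sample shaped like the dictionary returned in Step 2.
sample_results = {
    "example.com": {"id": "report-1"},
    "example.org": None,
}

accepted, failed = split_results(sample_results)
print(f"Accepted: {list(accepted)}")  # Accepted: ['example.com']
print(f"Failed: {failed}")            # Failed: ['example.org']
```

The failed list could be used to retry those domains or to report them at the end of the run.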

Step 3: Wait for the reports to complete

In this step, we want to check if our reports have finished generating. To do this, we can retrieve our recently completed reports from the reports/recent endpoint and compare them to our requested reports.

Each report has a unique ID that is created when you request to generate it. We can verify if a report is finished by comparing the report IDs of the recent reports to the report confirmations we generated in the last step.

We will start by defining the wait_for_completed_reports function, which takes the report_confirmations dictionary returned in Step 2. Inside it, we create a dictionary called report_confirmation_ids that extracts the report IDs from our report confirmations and maps each ID to its company domain, so we know which report belongs to which company.

    report_confirmation_ids = {}
    for domain, report_info in report_confirmations.items():
        if not report_info:
            continue  # Skip companies whose report request failed in Step 2
        report_id = report_info["id"]
        report_confirmation_ids[report_id] = domain
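
To make the mapping concrete, here is the same inversion applied to made-up confirmations. The sketch skips entries that are None or that lack an id, since generate_summary_reports stores None for a domain whose request failed. The helper name map_ids_to_domains and the sample IDs are invented for this example.

```python
def map_ids_to_domains(report_confirmations):
    """Invert {domain: confirmation} into {report_id: domain}."""
    ids = {}
    for domain, info in report_confirmations.items():
        if not info or "id" not in info:
            continue  # No usable confirmation for this domain
        ids[info["id"]] = domain
    return ids


# Made-up confirmations: one accepted, one failed, one missing its id.
sample_confirmations = {
    "example.com": {"id": "report-1"},
    "example.org": None,
    "example.net": {},
}

print(map_ids_to_domains(sample_confirmations))  # {'report-1': 'example.com'}
```

Keying the dictionary by report ID makes the later membership check a single dictionary lookup per recent report.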

Next, we'll set up an API request to the reports/recent endpoint to fetch recently completed reports.

    # Set up API Request
    url = "https://api.securityscorecard.io/reports/recent"
    headers = {
        "Authorization": f"Token {api_token}",
        "accept": "application/json",
    }

With our IDs and API request prepared, we can now execute the API call and determine if report generation is complete.

To achieve this, we'll make a call to the reports/recent endpoint and verify that all report IDs stored in report_confirmation_ids are found within the API response. Once all reports have finished and are matched, we'll return a list containing the corresponding report data from the API.

If not all reports are ready, we'll implement a delay before sending another API request to re-check the recently completed report list.

    matched_receipts = set()
    matched_report_data = []  # List to store data for matched reports
    start_time = time.time()
    timeout = 60 * 60  # 1 hour timeout.

    # Polls the API to match generated reports with requested reports and retrieve their details.
    while time.time() - start_time < timeout:
        response = None  # Reset so a failed request never reports a stale response
        try:
            response = requests.get(url, headers=headers)
            response.raise_for_status()
            data = response.json()

            # Iterates through API report data, identifying and storing matched reports.
            for report in data.get("entries", []):
                receipt_id = report.get("id")
                download_url = report.get("download_url")
                if receipt_id and receipt_id in report_confirmation_ids and download_url:
                    matched_receipts.add(receipt_id)
                    matched_report_data.append(report)

            # Checks if all keys of report_confirmation_ids are matched
            if set(report_confirmation_ids.keys()) == matched_receipts:
                print(f"{len(report_confirmation_ids)} Reports Successfully Finished Generating")
                return matched_report_data

            # Get the domains that are still waiting:
            waiting_domains = [
                report_confirmation_ids[receipt_id]
                for receipt_id in report_confirmation_ids.keys()
                if receipt_id not in matched_receipts
            ]
            print(f"Matched {len(matched_receipts)} of {len(report_confirmation_ids)} receipts. Checking again in 1 minute...")
            print(f"Waiting on domains: {waiting_domains}")
            matched_report_data = []  # clear reports for next fetch
            time.sleep(60)  # Wait for 1 minute
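
The wait-and-retry pattern in this loop can also be written as a small reusable helper. The sketch below is one possible shape, not the script's actual implementation: poll_until and its parameters are hypothetical names, and the clock and sleep arguments exist only so the example can run instantly with a fake clock.

```python
import time


def poll_until(check, timeout=3600, interval=60, clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns a non-None value or time runs out.

    Returns the value from check(), or None if the timeout is reached.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if clock() + interval > deadline:
            return None  # Another wait would overshoot the deadline
        sleep(interval)


# Demonstration with a fake clock so nothing actually sleeps.
fake_now = [0.0]
attempts = []


def fake_check():
    attempts.append(fake_now[0])
    return "ready" if len(attempts) == 3 else None


result = poll_until(
    fake_check,
    timeout=300,
    interval=60,
    clock=lambda: fake_now[0],
    sleep=lambda seconds: fake_now.__setitem__(0, fake_now[0] + seconds),
)
print(result, attempts)  # ready [0.0, 60.0, 120.0]
```

In the real script, check would wrap the call to the reports/recent endpoint and return the matched report data once every requested ID has been found.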

View the endpoint details and response behavior on our API Reference page.

Similar to previous steps, we'll handle API call errors.


Finished Function:

def wait_for_completed_reports(report_confirmations):
    # Extracts the "id" values and their corresponding domain names from the
    # report confirmations and builds a dictionary mapping IDs to domains.
    report_confirmation_ids = {}
    for domain, report_info in report_confirmations.items():
        if not report_info:
            continue  # Skip companies whose report request failed in Step 2
        report_id = report_info["id"]
        report_confirmation_ids[report_id] = domain

    # Set up API Request
    url = "https://api.securityscorecard.io/reports/recent"
    headers = {
        "Authorization": f"Token {api_token}",
        "accept": "application/json",
    }

    matched_receipts = set()
    matched_report_data = []  # List to store data for matched reports
    start_time = time.time()
    timeout = 60 * 60  # 1 hour timeout.

    # Polls the API to match generated reports with requested reports and retrieve their details.
    while time.time() - start_time < timeout:
        response = None  # Reset so a failed request never reports a stale response
        try:
            response = requests.get(url, headers=headers)
            response.raise_for_status()
            data = response.json()

            # Iterates through API report data, identifying and storing matched reports.
            for report in data.get("entries", []):
                receipt_id = report.get("id")
                download_url = report.get("download_url")
                if receipt_id and receipt_id in report_confirmation_ids and download_url:
                    matched_receipts.add(receipt_id)
                    matched_report_data.append(report)

            # Checks if all keys of report_confirmation_ids are matched
            if set(report_confirmation_ids.keys()) == matched_receipts:
                print(f"{len(report_confirmation_ids)} Reports Successfully Finished Generating")
                return matched_report_data

            # Get the domains that are still waiting:
            waiting_domains = [
                report_confirmation_ids[receipt_id]
                for receipt_id in report_confirmation_ids.keys()
                if receipt_id not in matched_receipts
            ]
            print(f"Matched {len(matched_receipts)} of {len(report_confirmation_ids)} receipts. Checking again in 1 minute...")
            print(f"Waiting on domains: {waiting_domains}")
            matched_report_data = []  # clear reports for next fetch
            time.sleep(60)  # Wait for 1 minute

        except requests.exceptions.RequestException as e:
            print(f"Error fetching recent reports: {e}")
            if response is not None:
                try:
                    print(f"Response Body: {response.json()}")
                except json.JSONDecodeError:
                    print(f"Response Text: {response.text}")
            return None

        except json.JSONDecodeError as e:
            print(f"Error decoding JSON response: {e}")
            if response is not None:
                print(f"Response Text: {response.text}")
            return None

    print("Timeout reached. Not all report receipts were matched.")
    return None

Step 4: Download reports into ZIP file

With the completed reports available, we'll download them into a new ZIP file. This involves looping through the completed reports and using each report's download_url (or data_download_url for JSON) to request the corresponding file.

When downloading our report, we can optionally specify a language and the desired report format (JSON or PDF).

Finished Function:

def download_and_zip_reports(report_data, report_format, zip_filename, language):
    # Get the current date in YYYY-MM-DD format
    current_date = datetime.date.today().strftime("%Y-%m-%d")
    zip_filename_with_date = f"{zip_filename}-{current_date}.zip"

    try:
        with zipfile.ZipFile(zip_filename_with_date, 'w') as zipf:
            print(f"{len(report_data)} reports to be downloaded.")
            for report in report_data:
                try:
                    # Determine the correct download URL based on report_format
                    if report_format == "json":
                        download_url = report.get("data_download_url")
                    else:  # Default to "pdf"
                        download_url = report.get("download_url")

                    if not download_url:
                        print("Skipping report: Missing 'download_url' or 'data_download_url'")
                        continue  # Skip to the next report if download_url is missing

                    headers = {"Authorization": f"Bearer {api_token}"}
                    # Append the language parameter only when a language was chosen
                    request_url = f"{download_url}?lng={language}" if language != "default" else download_url
                    response = requests.get(request_url, headers=headers, stream=True)
                    response.raise_for_status()  # Check for HTTP errors

                    # Extract the filename from the download_url (or use a default)
                    filename = os.path.basename(download_url)
                    if not filename:
                        filename = f"report_{report_data.index(report)}.{report_format.lower()}"  # Default name

                    # Write the downloaded content (PDF or JSON) to the zip file
                    zipf.writestr(filename, response.content)
                    print(f"Downloaded and added: {filename}")

                except requests.exceptions.RequestException as e:
                    print(f"Error downloading {download_url}: {e}")
                except Exception as e:
                    print(f"An unexpected error occurred: {e}")

        print(f"Successfully created zip file: {zip_filename_with_date}")
    except Exception as e:
        print(f"Error creating zip file: {e}")
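
The function above appends ?lng= directly to the download URL and takes the file name from the whole URL string. If a download URL ever carried its own query string, both steps would need more care. Whether SecurityScorecard's download URLs do so is not stated in this guide, so treat the following as a defensive sketch only. The helper names with_language and filename_from_url and the example URLs are made up; lng is the parameter name used by the script.

```python
import posixpath
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(download_url, language):
    """Return download_url with lng=<language> added, unless language is 'default'."""
    if language == "default":
        return download_url
    parts = urlsplit(download_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("lng", language))  # Keep any existing parameters intact
    return urlunsplit(parts._replace(query=urlencode(query)))


def filename_from_url(download_url, fallback):
    """Return the last path segment of the URL, ignoring any query string."""
    name = posixpath.basename(urlsplit(download_url).path)
    return name or fallback


# Made-up URLs for illustration only.
print(with_language("https://example.com/reports/abc.pdf", "es"))
# https://example.com/reports/abc.pdf?lng=es
print(with_language("https://example.com/reports/abc.pdf?sig=1", "es"))
# https://example.com/reports/abc.pdf?sig=1&lng=es
print(filename_from_url("https://example.com/reports/abc.pdf?sig=1", "report_0.pdf"))
# abc.pdf
```

Parsing the URL rather than concatenating strings keeps the query string valid and keeps query parameters out of the file names stored in the archive.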

Tying it all together

With our functions defined for each step, we can now integrate them into a complete workflow. We'll achieve this by passing the output of each function as input to the subsequent function.

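# Imports used by the functions above (place these at the top of the script)
import datetime
import json
import os
import time
import zipfile

import requests

# Add your API token and portfolio ID here
api_token = ""
portfolio_id = ""

# Optional parameters. Can leave variables as is if no personalization is desired.
report_format = "pdf"
branding = "company"
file_name = "downloaded-reports"
language = "default"


def main():
    # Get all companies in a Portfolio
    portfolio_companies = get_portfolio_companies(portfolio_id)
    if not portfolio_companies:
        print("No companies were retrieved. Exiting.")
        return

    # Generate summary reports for each company in the Portfolio. (Can specify branding choice here)
    report_confirmations = generate_summary_reports(portfolio_companies, branding)

    # Check to see if reports are completed
    completed_reports = wait_for_completed_reports(report_confirmations)
    if not completed_reports:
        print("No completed reports to download. Exiting.")
        return

    # Download all reports to a ZIP file
    download_and_zip_reports(completed_reports, report_format, file_name, language)


# Run the workflow when the script is executed directly
if __name__ == "__main__":
    main()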

And with that, our script is complete and ready to run.
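
If editing the variables at the top of the file is inconvenient, the same settings could be taken from the command line instead. This is a sketch using argparse from the standard library. The flag names are invented for this example and are not part of the published script; the defaults mirror the optional parameters shown above. The API token is deliberately left out of the flags so it does not end up in shell history.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options (hypothetical flag names)."""
    parser = argparse.ArgumentParser(
        description="Generate and download reports for a portfolio."
    )
    parser.add_argument("--portfolio-id", required=True, help="ID of the portfolio")
    parser.add_argument("--report-format", choices=["pdf", "json"], default="pdf")
    parser.add_argument("--branding", default="company")
    parser.add_argument("--file-name", default="downloaded-reports")
    parser.add_argument("--language", default="default")
    return parser.parse_args(argv)


# Passing a list makes the example runnable without real command-line input.
args = parse_args(["--portfolio-id", "example-portfolio-id", "--language", "es"])
print(args.portfolio_id, args.report_format, args.language)
# example-portfolio-id pdf es
```

In a real run, calling parse_args() with no argument reads the options from the command line, and main() could use the resulting values in place of the module-level variables.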


Using the Script

You can download the finished script on our GitHub Developer Community here.

Running the Script

To run the script, open your command-line interface and navigate to the directory where the script is saved.

  • On macOS or Linux: Use the command python generate_ssc_reports.py or python3 generate_ssc_reports.py
  • On Windows: Use the command python generate_ssc_reports.py

The script will start, and you'll see its progress printed to the command-line interface. After it finishes, the ZIP file will be in the same directory as your script.
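
To confirm what ended up in the archive without unpacking it, the standard zipfile module can list its contents. The sketch below is an illustration only: list_zip_contents is a hypothetical helper, and the demonstration builds a tiny in-memory archive rather than reading a real one. With the default settings, the script's output file would be named downloaded-reports-YYYY-MM-DD.zip.

```python
import io
import zipfile


def list_zip_contents(zip_file):
    """Return (name, size_in_bytes) for every file in the archive.

    `zip_file` may be a path to a ZIP file or an open file-like object.
    """
    with zipfile.ZipFile(zip_file) as zipf:
        return [(info.filename, info.file_size) for info in zipf.infolist()]


# Build a small in-memory archive to demonstrate the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zipf:
    zipf.writestr("example.pdf", b"%PDF-sample")

print(list_zip_contents(buffer))  # [('example.pdf', 11)]
```

A report with a size of zero bytes, or a count that differs from the number of companies in the portfolio, is a sign that some downloads failed.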

Running Script with GUI

For users who want to use this tool without editing the code, we've also developed a user-friendly version with a graphical user interface (GUI). This version allows you to input your API key, Portfolio ID, and other options through the GUI.

To launch the GUI, open your command-line interface and type python generate_ssc_reports_gui.py or python3 generate_ssc_reports_gui.py

This will open the GUI, where you can enter your API key, Portfolio ID, and other settings to generate your summary reports.

Continue Building

This script is just the beginning. Build upon it to create powerful, custom automation for your organization. Here are some ideas to explore:

  • Go beyond summary reports and generate Company Detailed, Company Events, or Company Issues reports.
  • Integrate third-party vendor companies into your report generation workflow.
  • Implement a monthly automation to generate reports automatically.
  • Develop an internal tool that enables teammates to self-serve report requests.
