Oil Trading Data Scraping & Automation

Automated oil market data collection and processing with web scraping, ETL, and Excel reporting.

July 2023 - Dec 2023
OQ Trading

Project Overview

I developed an end-to-end automated data pipeline for OQ Trading to streamline oil market intelligence. Using Selenium, I built scrapers to collect production, import/export, and intake data from government portals in multiple countries (China, Ecuador, Thailand, Brazil, etc.). I then designed a unified data model and used Pandas for cleaning and transformation to keep the data accurate and consistent across sources.

Key Features

  • Developed a Selenium-based web scraper to collect oil production, import/export, and intake data from government websites across multiple countries (China, Ecuador, Thailand, Brazil, etc.).
  • Automated data extraction from structured and unstructured sources, keeping the collected datasets accurate and up to date.
  • Standardized & Unified Data Processing: Designed a unified data model to standardize oil trade data from multiple sources. Used Pandas for data transformation, cleaning, and manipulation to ensure consistency.
  • Excel Data Automation & Pivot Table Processing: Automated the filtering and processing of Excel reports, including pivot tables, to extract relevant data fields. Built a data pipeline that dynamically selects appropriate fields based on business requirements.
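
As a rough illustration of the unified data model described in the list above, the sketch below maps two differently shaped source tables (one row-per-observation table and one pivot-style table with periods as columns) onto a single schema with Pandas. It is not the production code: the schema in UNIFIED_COLUMNS, the helper names from_long and from_pivot, and all sample values are hypothetical and chosen only for the example.

```python
import pandas as pd

# Hypothetical unified schema; the real project's column names may differ.
UNIFIED_COLUMNS = ["country", "period", "flow", "volume"]


def _finalize(df):
    """Keep only the unified columns, clean labels, and coerce volumes."""
    out = df[UNIFIED_COLUMNS].copy()
    out["flow"] = out["flow"].str.strip().str.lower()
    # Non-numeric placeholders such as "n/a" become NaN and are dropped.
    out["volume"] = pd.to_numeric(out["volume"], errors="coerce")
    out = out.dropna(subset=["volume"])
    return out.sort_values(["country", "period", "flow"]).reset_index(drop=True)


def from_long(df, country, column_map):
    """Normalize a table that already has one row per observation."""
    out = df.rename(columns=column_map)
    out["country"] = country
    return _finalize(out)


def from_pivot(df, country, flow_column):
    """Unpivot a pivot-style table whose other columns are periods."""
    out = df.melt(id_vars=[flow_column], var_name="period", value_name="volume")
    out = out.rename(columns={flow_column: "flow"})
    out["country"] = country
    return _finalize(out)


# Made-up sample inputs standing in for two scraped sources.
long_source = pd.DataFrame(
    {"Month": ["2023-07", "2023-07"], "Type": ["Export ", "IMPORT"], "Qty": ["120.5", "n/a"]}
)
pivot_source = pd.DataFrame(
    {"Flow": ["production"], "2023-07": [300.0], "2023-08": [310.0]}
)

unified = pd.concat(
    [
        from_long(long_source, "Ecuador", {"Month": "period", "Type": "flow", "Qty": "volume"}),
        from_pivot(pivot_source, "Thailand", "Flow"),
    ],
    ignore_index=True,
)
print(unified)
```

In this sketch the per-source column map is what decides which fields are kept, so adding a new source means adding a mapping rather than new cleaning logic.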

Challenges & Solutions

The main challenge was collecting oil trade data from multiple government portals with inconsistent formats, frequent structural changes, and in some cases unstructured data sources. To address this, I built a resilient Selenium-based scraping framework with dynamic selectors and error handling to adapt to changes in website structures.

Another challenge was consolidating heterogeneous datasets into a unified format suitable for analysis. I solved this by designing a standardized data model and implementing robust cleaning and transformation processes with Pandas, ensuring accuracy, consistency, and reliability across all reports.
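
One way to approximate the dynamic selectors and error handling mentioned above is to try several candidate locators in order and retry on transient failures. The following is a minimal sketch under that assumption, not the framework itself: find_with_fallback, FakeDriver, and the selector strings are hypothetical. It relies only on the find_elements(by, value) method that Selenium WebDriver objects expose, and uses a stand-in driver so it runs without a browser.

```python
import time

# Selenium's By.CSS_SELECTOR and By.XPATH are these plain strings, so the
# sketch needs no Selenium import and accepts any object that exposes
# find_elements(by, value), including a real WebDriver.
CSS = "css selector"
XPATH = "xpath"


def find_with_fallback(driver, selectors, retries=3, delay=1.0):
    """Return elements for the first selector that matches, else []."""
    for attempt in range(1, retries + 1):
        for by, value in selectors:
            try:
                elements = driver.find_elements(by, value)
            except Exception as exc:  # e.g. a transient driver error
                print(f"attempt {attempt}: {value!r} failed: {exc}")
                continue
            if elements:
                return elements
        if attempt < retries:
            # Linear backoff before re-trying the whole selector list.
            time.sleep(delay * attempt)
    return []


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, used only for this example."""

    def __init__(self, pages):
        self.pages = pages

    def find_elements(self, by, value):
        return self.pages.get((by, value), [])


# The page "changed": the old CSS id is gone, but the XPath still matches.
driver = FakeDriver({(XPATH, "//table[contains(@class, 'data')]//tr"): ["row1", "row2"]})
rows = find_with_fallback(
    driver,
    [
        (CSS, "table#monthly-output tr"),
        (XPATH, "//table[contains(@class, 'data')]//tr"),
    ],
    delay=0,
)
print(rows)
```

Ordering the selectors from most to least specific means a layout change on a portal degrades to a broader locator instead of failing the whole run.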

Outcome & Impact

The automation eliminated manual data collection tasks, significantly reducing the cost of data collection while ensuring faster and more reliable access to standardized oil market data. The resulting dataset was also used to train machine learning algorithms, providing a foundation for predictive modeling and more accurate trading strategies.

Technologies Used

Python, Pandas, Selenium, Web Scraping, Excel Automation, Data Cleaning
