alek-pol/location_codes_scraper

Location Codes Scraper

This is a command-line Ruby script that scrapes the UNECE website to extract UN/LOCODEs (United Nations Codes for Trade and Transport Locations) for countries and territories. The script uses the Kimurai framework for web scraping.

Features

  • Scrapes data from the UNECE website for UN/LOCODEs.
  • Outputs results to a CSV file.
  • Supports appending data to an existing CSV file.
  • Can run in test mode without actual scraping or saving data.
  • Includes detailed logging for debugging and additional information.
  • Performs validation checks on the data while scraping tables.
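
The CSV output and append features listed above could be sketched roughly as follows. This is a minimal illustration using Ruby's standard `csv` library, not the script's actual code: the helper name `save_rows` and the column names are assumptions made for the example.

```ruby
require "csv"

# Hypothetical helper (not taken from the script): writes rows to a CSV file,
# either replacing the file or appending to it.
# The default column names are illustrative assumptions.
def save_rows(path, rows, append: false, headers: %w[country locode name])
  # Write the header row when creating a fresh file, or when appending
  # to a file that does not exist yet or is still empty.
  write_headers = !append || !File.exist?(path) || File.zero?(path)

  # "a" keeps existing content and adds to the end; "w" truncates the file.
  CSV.open(path, append ? "a" : "w") do |csv|
    csv << headers if write_headers
    rows.each { |row| csv << row }
  end
end
```

Opening the file in `"a"` mode is what makes a second run add to the previous results instead of overwriting them; the header check avoids repeating the header row in the middle of the file.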

Requirements

  • Ruby 2.6 or higher.
  • The following Ruby gems:
    • kimurai (version 1.4.0)

Installation & Usage

1. Install Ruby and Dependencies

Make sure you have Ruby installed on your system. You can check if Ruby is installed by running:

ruby --version

If Ruby is not installed, or the reported version is older than 2.6, download and install a current Ruby release.

Then install the required gem, pinned to the version listed under Requirements:

gem install kimurai -v 1.4.0

2. Download the Script

Clone the repository or download the script file location_codes_scraper.rb from the repository.

git clone https://github.com/alek-pol/location_codes_scraper.git
cd location_codes_scraper

Alternatively, if you're just using the script file, download it and place it in your desired directory.

3. Running the Script

Once the script is downloaded and dependencies are installed, you can run it with various options.

ruby location_codes_scraper.rb [options]

Example Commands:

  • Run the script in test mode (no data saved, allowing you to verify the availability and structure of the source tables on the UNECE website):

    ruby location_codes_scraper.rb --test

  • Run with detailed logging:

    ruby location_codes_scraper.rb --verbose

  • Save the results to a specific file:

    ruby location_codes_scraper.rb --path /path/to/output.csv

  • Append results to an existing file:

    ruby location_codes_scraper.rb --append --path /path/to/output.csv
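
For readers curious how flags like these can be handled, here is a minimal sketch using Ruby's standard `optparse` library. It only assumes the four options shown in the commands above and the default output file `location_codes.csv`; the method name `parse_options` and the help texts are hypothetical, and the real script may parse its arguments differently.

```ruby
require "optparse"

# Illustrative sketch only: parses the flags documented in this README
# into a plain Hash. Not the script's actual implementation.
def parse_options(argv)
  options = {
    test: false,
    verbose: false,
    append: false,
    path: "location_codes.csv" # default output file per this README
  }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: ruby location_codes_scraper.rb [options]"
    opts.on("--test", "Check the source tables without saving data") { options[:test] = true }
    opts.on("--verbose", "Enable detailed logging") { options[:verbose] = true }
    opts.on("--append", "Append to an existing CSV file") { options[:append] = true }
    opts.on("--path PATH", "Output CSV file") { |path| options[:path] = path }
  end

  # Parse a copy so the caller's argument array is left untouched.
  parser.parse!(argv.dup)
  options
end
```

Calling `parse_options(ARGV)` at the top of a script would then give one place to read the run mode, log level, and output path from.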

4. (Optional) Automate Script Execution

To run the script on a regular schedule, add it to a cron job (Linux/macOS) or to Task Scheduler (Windows).

Notes

  • The --path argument specifies the file where the results will be saved. By default, this is location_codes.csv.
  • The script checks and processes tables on the UNECE website. If the structure of the tables changes, an error message will be displayed.
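
A structure check of the kind described in the last note might look like the sketch below. It is only an assumption about the general approach: the expected column names, the error class `TableStructureError`, and the method `validate_headers!` are all hypothetical and are not taken from the script or from the UNECE pages.

```ruby
# Hypothetical column names; the real UNECE table headers may differ.
EXPECTED_HEADERS = ["LOCODE", "Name", "Function"].freeze

# Hypothetical error type raised when a table no longer matches expectations.
class TableStructureError < StandardError; end

# Compares the header cells scraped from a table with the expected ones.
# Returns true on a match, otherwise raises with a descriptive message.
def validate_headers!(actual, expected = EXPECTED_HEADERS)
  # Ignore surrounding whitespace, which is common in scraped cell text.
  normalized = actual.map { |header| header.to_s.strip }
  return true if normalized == expected

  missing = expected - normalized
  unexpected = normalized - expected
  raise TableStructureError,
        "Unexpected table structure (missing: #{missing.inspect}, unexpected: #{unexpected.inspect})"
end
```

Failing early with a message that names the missing or unexpected columns makes it easier to tell a changed page layout apart from a network or parsing problem.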

License

This script is open-source and licensed under the MIT License.
