URL Classification


Discovered Intelligence - Dissemination

  • Phishing is a threat to the entire financial/banking industry, so research and tools developed by one company should be shared across the industry to help minimize it.
  • Because a phishing attack on a single employee can impact a whole organization, all employees should be made aware of phishing threats and how to detect them.
  • The findings can be turned into guides, shared with all employees, that outline common trends and patterns to look for in links before clicking.
  • Because malicious URLs are ever-changing, employees should report any suspicious link they receive, as well as any concerns about links they have already clicked.

Discovered Intelligence - Courses of Action

  • To help protect organizations as a whole, companies can:
    • Implement third-party tools or software (or develop them in-house) that scan links before they are opened or the page loads.
    • Create stricter criteria for email filters.
  • To help ensure employees remain vigilant against phishing threats, companies can run internally planned, safe phishing exercises.
    • Those who report the links or don't click them are rewarded, whereas those who do click are assigned additional training.
    • This helps companies monitor awareness of these threats among their employees and adjust training where necessary.

Discovered Intelligence - Iterative Changes

  • Because the non-malicious dataset is limited to 500 URLs, we would like to find a source that provides more data, both to build a larger dataset and to even out the balance between malicious and non-malicious URLs.
  • Because the datasets provided contain general URLs rather than industry-specific ones, we would like to find datasets that are specific to the financial industry.
    • This will help provide more detail to the analysis and potentially uncover additional characteristics/themes in the URLs.
  • Ensure the data from both sources is consistent in format and not missing certain characteristics.
    • Not all of the URLs provided by Moz.com included the HTTP/HTTPS scheme or the www prefix, which creates potential discrepancies.
    • Additionally, URLs provided by Moz.com did not contain URL parameters.
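
One way to reduce the discrepancies noted above is to reduce every URL to the parts both sources share before comparing them. The following is a minimal sketch, not the method used in this analysis: the function name normalize_url and the choice to keep only the host and path are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to lowercase host + path (hypothetical helper).

    Drops the scheme, a leading "www.", the query string, and the
    fragment, so URLs from both sources are compared on the same parts.
    """
    raw = url.strip()
    # urlsplit only recognizes the host when a scheme or a leading "//"
    # is present, so add "//" to bare entries such as "example.com/login".
    if "://" not in raw:
        raw = "//" + raw
    parts = urlsplit(raw)
    host = parts.hostname or ""  # hostname is already lowercased, port removed
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path.rstrip("/")
    return host + path


print(normalize_url("https://www.Example.com/login?user=1"))
print(normalize_url("example.com/login"))
```

Both calls print the same value, so the two forms would be treated as equivalent. Dropping the query string discards information that may itself help separate malicious from non-malicious URLs, so a normalization like this is better suited to checking that the sources are comparable than to building the final feature set.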